Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2030 : all new from createComponents() #3040

Merged
merged 1 commit into from
Feb 17, 2020

Conversation

idodeclare
Copy link
Contributor

Hello,

Harmful re-use from createComponents() seems to be the actual problem. Let's see if this fixes that issue.

—CF

@vladak
Copy link
Member

vladak commented Feb 14, 2020

Tried the test from #2030 (comment) , seems to do the trick.

@vladak
Copy link
Member

vladak commented Feb 14, 2020

In general this looks good. Could you rebase after #3041 goes in to see if the tests are clean ?

@vladak
Copy link
Member

vladak commented Feb 14, 2020

What about HistoryAnalyzer's use of JFlexTokenizer in its createComponents() ?

@idodeclare
Copy link
Contributor Author

What about HistoryAnalyzer's use of JFlexTokenizer in its createComponents() ?

No harmful reuse there

@idodeclare
Copy link
Contributor Author

Oh I need to make a slight improvement

@idodeclare idodeclare closed this Feb 15, 2020
@tarzanek
Copy link
Contributor

interesting ... @vladak will you retest?

@vladak
Copy link
Member

vladak commented Feb 17, 2020

@tarzanek sure, TaaS :-)

I think a test case should be added.

@vladak
Copy link
Member

vladak commented Feb 17, 2020

For the record, what was the improvement ?

@vladak
Copy link
Member

vladak commented Feb 17, 2020

For the record, what was the improvement ?

I see, using method reference:

<<<<<<< HEAD                                                                    
        this.symbolTokenizerFactory = createPlainSymbolTokenizerFactory();      
=======                                                                         
        this.symbolTokenizerFactory = this::createPlainSymbolTokenizer;         
>>>>>>> c9160011b169e04969ce098c2253c452aef6b200

however I fail to see how this can work.

@tarzanek
Copy link
Contributor

tarzanek commented Feb 17, 2020

so the fix seems to be the use of functional supplier to provide symbol factory:

Supplier<JFlexTokenizer> symbolTokenizerFactory;

how exactly does it work in detail, I'd love to know too
(functional means thread safe locking + immutable objects where possible, but not sure what exactly the Supplier uses to stay thread safe)
@idodeclare ?

@vladak
Copy link
Member

vladak commented Feb 17, 2020

Anyway, the updated fix works so merging.

@vladak vladak merged commit 9cd38b9 into oracle:master Feb 17, 2020
@idodeclare idodeclare deleted the bugfix/tokenizer_factory branch February 17, 2020 23:54
@idodeclare
Copy link
Contributor Author

so the fix seems to be the use of functional supplier to provide symbol factory:

Supplier<JFlexTokenizer> symbolTokenizerFactory;

To clarify, it's a functional supplier of new tokenizers.

how exactly does it work in detail, I'd love to know too
(functional means thread safe locking + immutable objects where possible, but not sure what exactly the Supplier uses to stay thread safe)
@idodeclare ?

@tarzanek , it's thread safe because there is no sharing of data until the output of the supplier is passed along to Lucene after the function returns. No locking required. The object is not immutable but we rely on Lucene also not sharing the object among threads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants