Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query: IndexWriter.LongIndex reverse should also use a threadsafe cache? #47

Closed
eclipsewebmaster opened this issue May 8, 2024 · 4 comments

Comments

@eclipsewebmaster
Copy link

| --- | --- |
| Bugzilla Link | 582823 |
| Status | NEW |
| Importance | P3 normal |
| Reported | Dec 31, 2023 17:04 EDT |
| Modified | Jan 02, 2024 14:39 EDT |
| Version | 1.15 |
| Reporter | Jason Koch |

Description

I notice that IndexReader.LongIndexReader::reverse uses a ConcurrentHashMap to track the binary search cache. I see similar code in IndexWriter.LongIndex but it hasn't been modified to use a CHM.

I think it should be swapped, and have a very similar patch running on our local branch.

Was there a reason for this? Happy to submit patch for it to use CHM.

@eclipsewebmaster
Copy link
Author

By Andrew Johnson on Jan 02, 2024 04:22

Do you know where IndexWriter.LongIndex.reverse() is used?
It wasn't changed because it didn't seem to be used in a multi-threaded way.

We could go through the API working out which ones are thread-safe, but might need input from the adopter community to see what is important.

@eclipsewebmaster
Copy link
Author

By Jason Koch on Jan 02, 2024 14:38

Do you know where IndexWriter.LongIndex.reverse() is used?

Good question. I modified the code to throw an exception, and only and 8 tests failed:

  • longIndexCollector4[index size 6]
  • longIndexCollector4[index size 999,999]
  • longIndexCollector4[index size 1,000,000]
  • longIndexCollector4[index size 1,000,001]
  • longIndexCollector4[index size 1,999,999]
  • longIndexCollector4[index size 2,000,000]
  • longIndexCollector4[index size 2,000,001]
  • longIndexCollector4[index size 1,000]

So it would appear the code is currently unused in any of the test cases.

Further, when I look at the code paths.

  • LongIndex is abstract static class.
  • IndexWriter.LongIndexCollector extends it and uses the LongIndex impl
  • IndexWriter.LongIndexStreamer (and PosIndexStreamer) extends it and uses the LongIndex impl
  • IndexReader.LongIndexReader extends it and provides its own reverse() implementation

As far as I can tell, the writeTo() all return a LongIndexReader which has its own impl. The LongIndexCollector and LongIndexStreamer are never used with a reverse() call. Identifier::reverse is also called but it has its own implementation too!

We could go through the API working out which ones are thread-safe, but might need input from the adopter community to see what is important.

On one hand, I think upgrading something not-thread-safe to thread-safe seems wise. On the other, Hyrum's Law is also at play here, given majority of the parser classes are not thread-safe and only being made thread-safe as a result of demands from hprof parser.

I'm only aware of one extension (Mat-Calcite-Plugin), and one app over the top (JIFA) that rely on MAT. Is there a catalogue available?

@eclipsewebmaster
Copy link
Author

By Jason Koch on Jan 02, 2024 14:39

So it would appear the code is currently unused in any of the end to end / integration test cases.

@jasonk000
Copy link
Contributor

Recommend to close as @krumts

@krumts krumts closed this as completed May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants