-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RL/CORE-8480 #1
RL/CORE-8480 #1
Conversation
There seems to be some whitespace issues. I can make a separate PR converting the entire project to use just spaces or tabs. |
- name: Set up JDK | ||
uses: actions/setup-java@v2 | ||
with: | ||
java-version: 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you pick Java 8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pretty much just copy / pasted that file from LowMemoryAhoCorasick. I believe that repo was made Java 8 because we exported the unpacker stuff with a target of Java 8. Don't know if thats still the case at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah unpacker runs with java 17 but compiles/builds for java 8 (because DSM/DME requires it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my third party projects run on java 11 but compile/build for java 8 btw (so you dont have to enforce necessarily setting java to 8 here)
see https://github.com/pkware/AttributeCachingFileSystem/blob/main/file-attribute-caching/build.gradle.kts and the kotlin block using javaToolChains and https://github.com/pkware/AttributeCachingFileSystem/blob/main/build.gradle.kts using the jvm target to hit 1.8 there (ie java 8)
import kotlin.math.max | ||
import kotlin.math.min | ||
|
||
class PKWareTests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class PKWareTests { | |
class GenerexTestKotlin { |
I hate it, but if we're contributing this back with Kotlin in it, this seems like the way.
My real stance is it's probably best to settle on rewriting these in Java (🤮 , I know)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Project seems to have been dead for years, so don't currently plan on contributing back, tis part of the reason I said f it and changed the build system. Can still rename if want though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I last talked with @MatthewFoster624 , the plan was to see if there were active forks and base our changes off of that in hopes of finding something worthy of our contribution.
If we're just keeping this to ourselves, I guess it's fine but I'd still think a rename would be better even if it's a nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to go with the rename. Even if it's a dead branch and we don't fork back to it, we are keeping this open source (as it should be).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Rename (keeping Kotlin is fine)
/** | ||
* Generate and return a random String that match the pattern used in this Generex, and the string has a length >= | ||
* <code>minLength</code> and <= <code>maxLength</code> | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the indentation is a little off here to line 462
assertThat(ratio).isLessThan(1.1) | ||
} | ||
|
||
companion object { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be private?
|
||
@After | ||
public void tearDown() throws Exception { | ||
for (String result : new Generex(pattern)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation is off here
iterator.next(); | ||
} | ||
iterator.remove(); | ||
for (String result : generex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation of this block is off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix the formatting issues in a separate PR. Believe its because the project currently uses a mix of tab and space characters.
… to JUnit5, and by updating the minimum Java target. Removed custom Iterator / Iterable classes. Migrated from a POM to Gradle build.
…exes. Lengths of strings produced to match a regex more are now more likely to follow a uniform distribution. Transitions are now weighted by the number of characters that can be used. This will produce a wider variety of strings on average. Fixed issues relating to regex escape sequences.
7dbce00
to
164ea37
Compare
Review this PR commit by commit.
The first commit is mainly busy work getting Gradle set up and migrating from JUnit 4 to 5 + introducing Google Truth. The gradle files were pretty much just copied from the LowMemoryAhoCorasick project. I don't particularly care if everything is exactly up to date on that stuff. The repository we forked from seems to be dead so I have no qualms changing the build system if it makes it more inline with our other open source projects.
The remaining commits attempt to make the regex generation process more intuitive by giving concrete details on what happens when a regex can't generate a string in a requested range, and by increasing the likely hood that each potential string has an equal probability of being generated. That wanted behavior is heavily dependent on the regex. A bug which could easily cause a stack overflow exception was fixed, along with a bug that caused regex escape sequences to be handled correctly.