Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RL/CORE-8480 #1

Merged
merged 3 commits into from
Aug 7, 2023
Merged

RL/CORE-8480 #1

merged 3 commits into from
Aug 7, 2023

Conversation

RyanLuMaye
Copy link

Review this PR commit by commit.

The first commit is mainly busy work getting Gradle set up and migrating from JUnit 4 to 5 + introducing Google Truth. The gradle files were pretty much just copied from the LowMemoryAhoCorasick project. I don't particularly care if everything is exactly up to date on that stuff. The repository we forked from seems to be dead so I have no qualms changing the build system if it makes it more inline with our other open source projects.

The remaining commits attempt to make the regex generation process more intuitive by giving concrete details on what happens when a regex can't generate a string in a requested range, and by increasing the likely hood that each potential string has an equal probability of being generated. That wanted behavior is heavily dependent on the regex. A bug which could easily cause a stack overflow exception was fixed, along with a bug that caused regex escape sequences to be handled correctly.

@RyanLuMaye
Copy link
Author

There seems to be some whitespace issues. I can make a separate PR converting the entire project to use just spaces or tabs.

- name: Set up JDK
uses: actions/setup-java@v2
with:
java-version: 8

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you pick Java 8?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pretty much just copy / pasted that file from LowMemoryAhoCorasick. I believe that repo was made Java 8 because we exported the unpacker stuff with a target of Java 8. Don't know if thats still the case at this time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably it doesn't matter that much. Just figured we'd push for 11 or 17 if we can.
@mbayerPK / @PKJosh Is Unpacker still driving a java 8 baseline?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah unpacker runs with java 17 but compiles/builds for java 8 (because DSM/DME requires it)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my third party projects run on java 11 but compile/build for java 8 btw (so you dont have to enforce necessarily setting java to 8 here)

see https://github.com/pkware/AttributeCachingFileSystem/blob/main/file-attribute-caching/build.gradle.kts and the kotlin block using javaToolChains and https://github.com/pkware/AttributeCachingFileSystem/blob/main/build.gradle.kts using the jvm target to hit 1.8 there (ie java 8)

import kotlin.math.max
import kotlin.math.min

class PKWareTests {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class PKWareTests {
class GenerexTestKotlin {

I hate it, but if we're contributing this back with Kotlin in it, this seems like the way.

My real stance is it's probably best to settle on rewriting these in Java (🤮 , I know)

Copy link
Author

@RyanLuMaye RyanLuMaye Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Project seems to have been dead for years, so don't currently plan on contributing back, tis part of the reason I said f it and changed the build system. Can still rename if want though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I last talked with @MatthewFoster624 , the plan was to see if there were active forks and base our changes off of that in hopes of finding something worthy of our contribution.

If we're just keeping this to ourselves, I guess it's fine but I'd still think a rename would be better even if it's a nit.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to go with the rename. Even if it's a dead branch and we don't fork back to it, we are keeping this open source (as it should be).

Copy link

@MatthewFoster624 MatthewFoster624 Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Rename (keeping Kotlin is fine)

/**
* Generate and return a random String that match the pattern used in this Generex, and the string has a length >=
* <code>minLength</code> and <= <code>maxLength</code>
/**
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the indentation is a little off here to line 462

assertThat(ratio).isLessThan(1.1)
}

companion object {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be private?


@After
public void tearDown() throws Exception {
for (String result : new Generex(pattern)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is off here

iterator.next();
}
iterator.remove();
for (String result : generex) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation of this block is off

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix the formatting issues in a separate PR. Believe its because the project currently uses a mix of tab and space characters.

… to JUnit5, and by updating the minimum Java target.

Removed custom Iterator / Iterable classes.
Migrated from a POM to Gradle build.
…exes.

Lengths of strings produced to match a regex more are now more likely to follow a uniform distribution.
Transitions are now weighted by the number of characters that can be used. This will produce a wider variety of strings on average.
Fixed issues relating to regex escape sequences.
@RyanLuMaye RyanLuMaye merged commit a7a0a87 into master Aug 7, 2023
2 checks passed
@RyanLuMaye RyanLuMaye deleted the RL/CORE-8480 branch August 7, 2023 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants