Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature #1767 - RegexSwap client reimplementation based on JSON endpoint #1896

Merged
merged 4 commits into from Jan 5, 2022

Conversation

kaspersorensen
Copy link
Member

As described and discussed in #1767 - this changes the RegexSwapClient class to use https://datacleaner.github.io/content/regexes.json for fetching regexes instead of the decomissioned old XML endpoints.

final ObjectMapper objectMapper = new ObjectMapper();
root = objectMapper.readValue(json, Map.class);
} catch (IOException e) {
throw new UncheckedIOException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this warn the user, or is it handled somewhere else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really mean "more specific". I meant more like a proper "we couldn't contact the server" local warning instead of just dumping it up the chain.

But I might misremember how we handle exceptions and this is all handled properly somewhere (though "someone will probably handle it" seems like a bad strategy).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, but since this is a engine/backend component I think that throwing an exception is most appropriate. We could add more logging and/or make a specific exception for this scenario. But then again I do kinda think this is also just about wrapping the checked exception in an unchecked exception and deciding not to try and handle it, because we have no good way to recover from it here.

}
return regexes;
}
private static final URI REGEXES_URI = URI.create("https://datacleaner.github.io/content/regexes.json");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving the URL as a hardcoded value may leave us open to a similar issue again later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yes but I find it hard to imagine a situation where we wouldn't have to pull a release to fix it anyway?
Is your suggestion that we use an environment variable or something like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just some kind of fallback. I guess an environment variable that overrides the default would be perfect for that.

Just so that someone who doesn't want to update for compatibility or whatever reason (or who wants a custom set of regexes hosted locally), can update the URL.

But its no biggie.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed.

@LosD
Copy link
Contributor

LosD commented Jan 3, 2022

Code itself looks mostly good to me, but I see some changes from spaces to tabs in RegexSwapClient.java and Regex.java. If I remember correctly, the code style is spaces.

@kaspersorensen
Copy link
Member Author

Code itself looks mostly good to me, but I see some changes from spaces to tabs in RegexSwapClient.java and Regex.java. If I remember correctly, the code style is spaces.

OMG yes. This is what happens when you (I) don't code for a long time and go rogue with a fresh Eclipse installation :D Thanks for the feedback notes!

Copy link
Contributor

@LosD LosD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real blockers, though I'm not sure I agree with the exception strategy. But if I remember correctly, this is just using the new-ish Oracle approved way to strip checked exceptions that we (unfortunately) already did in several places, so I guess it's not really a step backwards.

@kaspersorensen kaspersorensen merged commit 5793dc7 into master Jan 5, 2022
@kaspersorensen kaspersorensen deleted the feature/1767-regexswap-json-client branch January 5, 2022 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants