Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ExternalMappingSourceProcessor #271

Merged
merged 14 commits into from
Mar 6, 2022

Conversation

matvey-mtn
Copy link
Contributor

No description provided.

conn.setConnectTimeout(5000);
conn.setReadTimeout(5000);

if (conn.getResponseCode() != 200) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (conn.getResponseCode() != 200) {
if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {

Comment on lines 34 to 35
conn.setConnectTimeout(5000);
conn.setReadTimeout(5000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants should be parameters of the processor, with reasonable defaults such that users will usually not to fill these in, but in case of a problematic remote server, they can set higher values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

this.mappingSourceUrl = new URL(mappingSourceUrl);
}

public Map<String, List<String>> getMappings() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public Map<String, List<String>> getMappings() {
public Map<String, List<String>> loadMappings() {

using the get* method pattern implies that you are reading a local field, not performing a remote call


private Pair<String, List<String>> toKeyValuePair(String inputLine) {
String[] keyVal = inputLine.split("=");
String[] split = keyVal[1].trim().split(",");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail on an IndexOutOfBounds if the input is faulty and contains no '='.
It is better to validate the values explicitly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 65 to 70
private void startScheduledExecutor() {
ScheduledExecutorService scheduledRulesUpdaterExecutor = Executors.newScheduledThreadPool(1,
new ThreadFactoryBuilder().setNameFormat("refresh-mapping-executor").setDaemon(true).build());
scheduledRulesUpdaterExecutor.scheduleAtFixedRate(this::refreshExternalMapping, mappingRefreshPeriodInSeconds,
mappingRefreshPeriodInSeconds, TimeUnit.SECONDS);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every processor will have its own separate thread, which mostly does nothing. This seems wasteful.
I think this thread pool can be static, and lazy initialized.

}

private void refreshExternalMapping() {
keyValueMappingsCache = externalMappingsClient.getMappings();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread-safe. You need to mark this variable as 'volatile', or alternatively use the AtomicReference class (which does the same thing internally)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, shame on me 😄

List<String> targetField = doc.getField(TARGET_FIELD_NAME);
assertThat(targetField).isEmpty();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing some additional tests here for bad file formats: Empty file, file without '=' character, file with empty key or empty value.
All of these things will happen sooner or later.

Also, there should be a test for non-200 responses.

Copy link
Contributor Author

@matvey-mtn matvey-mtn Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added all test cases except

file with an empty key or empty value.

will add it tomorrow

@matvey-mtn matvey-mtn requested a review from barakm March 2, 2022 17:46
if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
throw new HttpRequestExecutionException("Couldn't load external mappings. Message: " + conn.getResponseMessage());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check for content-length, verifying that the downloaded file is overall < 50MB. Otherwise, a mega file could kill the JVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to do it in a separate PR. Will open a ticket for it.


try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
String inputLine;
while ((inputLine = in.readLine()) != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a validation for number of lines. More than 50K lines should fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to do it in a separate PR. Will open a ticket for it.

Copy link
Collaborator

@DanMelman DanMelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!
Added some comments.

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

* CR fixes

- added tests for escape characters
- added test with empty values
@matvey-mtn
Copy link
Contributor Author

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

@DanMelman I don't think we need this test. refreshExternalMapping() is already covered, and testing the ScheduledExecutorService doesn't make sense.

@DanMelman
Copy link
Collaborator

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

@DanMelman I don't think we need this test. refreshExternalMapping() is already covered, and testing the ScheduledExecutorService doesn't make sense.

Missed that test. this is indeed enough coverage.

@DanMelman
Copy link
Collaborator

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

@DanMelman I don't think we need this test. refreshExternalMapping() is already covered, and testing the ScheduledExecutorService doesn't make sense.

Which test is that? Still can't find it

@matvey-mtn
Copy link
Contributor Author

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

@DanMelman I don't think we need this test. refreshExternalMapping() is already covered, and testing the ScheduledExecutorService doesn't make sense.

Which test is that? Still can't find it

@DanMelman all tests that trigger processor.process(doc) will trigger refreshExternalMapping() on the first invocation.

@DanMelman
Copy link
Collaborator

Also, I would add a test for the regular mapping refresh. This might require changing the parameter from seconds to milliseconds to spare some test time.

@DanMelman I don't think we need this test. refreshExternalMapping() is already covered, and testing the ScheduledExecutorService doesn't make sense.

Which test is that? Still can't find it

@DanMelman all tests that trigger processor.process(doc) will trigger refreshExternalMapping() on the first invocation.

@matvey-mtn My intention was to test the map update/replacement, including the case where a reload failed and we need to validate the old map stayed

@matvey-mtn
Copy link
Contributor Author

Which test is that? Still can't find it

@DanMelman all tests that trigger processor.process(doc) will trigger refreshExternalMapping() on the first invocation.

@matvey-mtn My intention was to test the map update/replacement, including the case where a reload failed and we need to validate the old map stayed

@DanMelman ok, I will add something


public class ExternalMappingsClient {

private final Logger logger = LoggerFactory.getLogger(ExternalMappingsClient.class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static!

private final int readTimeout;

public ExternalMappingsClient(ExternalMappingSourceProcessor.Configuration configuration) throws MalformedURLException {
this.mappingSourceUrl = new URL(requireNonNull(configuration.getMappingSourceUrl()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think requireNonNull is redundant, it will just be malformed url

Copy link
Collaborator

@DanMelman DanMelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@matvey-mtn matvey-mtn merged commit 8933f13 into master Mar 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants