Trim strings in znrecord fields for REST json content by huizhilu · Pull Request #1499 · apache/helix

huizhilu · 2020-10-30T05:53:50Z

Issues

My PR addresses the following Helix issues and references them in the PR description:

Description

Here are some details about my PR, including screenshots of any UI changes:

If users accidentally put a leading space in the json payloads and call helix rest to update cluster configs, the leading space is not correctly handled and causes configs not to be appropriately parsed. Then it leads to some issues in helix that relies on the configs.

This PR trims strings in the znrecords fields after the REST API's json payloads are deserialized to a znrecord.

Tests

The following tests are written for this issue:

TestZNRecordUtil#testTrimFields

The following is the result of the "mvn test" command on the appropriate module:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestClusterAccessor.testUpdateRESTConfig:718->AbstractTestClass.post:529 expected:<200> but was:<500>
[INFO]
[ERROR] Tests run: 170, Failures: 1, Errors: 0, Skipped: 1
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:46 min
[INFO] Finished at: 2020-10-29T23:12:16-07:00
[INFO] ------------------------------------------------------------------------

Documentation (Optional)

In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters (not including Jira issue reference)
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Code Quality

My diff has been formatted using helix-style.xml
(helix-style-intellij.xml if IntelliJ IDE is used)

jiajunwang · 2020-10-30T06:23:26Z

      throws IOException {
-    return OBJECT_MAPPER.reader(ZNRecord.class).readValue(data);
+    ZNRecord record = OBJECT_MAPPER.readValue(data, ZNRecord.class);
+    return ZNRecordUtil.trimFields(record);


Here's an interesting problem.
Since the rest is not covering all of our APIs, so assuming there is a user use java API sets " test" = 123. Then in the UI, user got "test" = 123 (due to this trim). Then user modifies it to be "test" = 456. After this change, in ZK, we now have both keys.

" test" = 123
"test" = 456

To who reads the ZK directly, he or she does not know which one is the right value.
To who reads it from UI, he or she will either get a duplicate key exception or see a random value depends on which value overwrites the other one.

I think this problem is hard to address.
That's why I think we still should try to fix the UI : )

@jiajunwang Thanks for the comment. There are many things to consider regarding this consistent problem

In the problem you described, two entries " test" = 123, "test" = 456 will show on UI. Then the user will see them and tries to fix it: removing one of them.

There are many ways to change the znode value: UI, rest API, java API, zooInspector, zk client, etc.. We cannot prevent users from setting the value with a leading space either by mistake or on purpose.

Regarding leading/trailing spaces, I don't think they make sense, at least 99% are useless and human errors and cause problems. I wonder why don't we make an agreement: either we enforce to trim the strings at a layer(REST/Java API before writing to zk, or when reading the configs to tolerate spaces, because we cannot enforce it for all the ways that can change znodes), or just leave it as is, requiring the end users to be responsible for it (then when there is an issue, they would still come to us for help troubleshoot).

From the past experience, the mistakes are usually made by UI or REST API. We cannot prevent this from all layers, but considering the pain it brought to us, we can make tradeoffs and do better for 99% cases that use UI/REST and release us. Then if we really encounter the 1% case as you said, the duplicate configs will show on UI as " test" = 123, "test" = 456. At least the problem is exposed earlier, then let the user resolve the problem.

I still doubt UI has the bug: auto adding leading spaces, because we could not reproduce it and I could not find any code that would add the leading space. If it has the bug, I think this is what you meant by fix the UI, I completely agree that we fix it. But if it doesn't, trimming is just an improvement. Even we do the improvement on UI, we cannot prevent from other ways. There'll still be the problem as you said.

You mean that we only control the write path, but not the read path, right? In this case, if the user doesn't want to fix them, but decide to change a value first. Then due to the same logic, when they update the value of " test" key, their request will be really applied to "test" key. I think it might introduce bugs or at least confusion. Unless we block any update when ambiguous keys are found. It might be a good way.

It is true that we cannot prevent all random write.
My thought is that, eventually, we need to lock down ZK write and only allow Helix REST access. Then we can do whatever you described above.
Before that, instead of introducing a complicated logic that won't be 100% working in all conditions, just high-light the suspicious keys in UI and it shall alert the user and the admin can fix it fast.

Here's an idea. The current implementation is fixing & modifying the user-provided input silently (without informing the user) to make the input conform to what we think is the right thing. This is a hidden assumption that will inevitably become harder to track and will leave both the user and Helix maintainers confused at one point in the future.

Why don't we just fail it if the input is invalid (input validation)? This way, the user will know, and as some reviewers pointed out above, we do not run the risk of writing to an existing value causing unwanted writes.

You mean that we only control the write path, but not the read path, right? In this case, if the user doesn't want to fix them, but decide to change a value first. Then due to the same logic, when they update the value of " test" key, their request will be really applied to "test" key. I think it might introduce bugs or at least confusion. Unless we block any update when ambiguous keys are found. It might be a good way.

It is true that we cannot prevent all random write.

My thought is that, eventually, we need to lock down ZK write and only allow Helix REST access. Then we can do whatever you described above.

Before that, instead of introducing a complicated logic that won't be 100% working in all conditions, just high-light the suspicious keys in UI and it shall alert the user and the admin can fix it fast.

I actually was thinking we just warn users that the field has leading/trailing spaces. Then on UI, users know the info from the warning. But that only applies to UI, not to REST. And it is actually an improvement/feature, not a bug fix. It'll need some JavaScript work on ui. I think we can definitely do the improvement if we have a UI engineer. I was just concerned about that currently we don't have he capacity for the improvement. If someone can implement this feature, it'll help users find out the potential incorrect configs on UI.

At rest layer with this, at least we can avoid 99% of human errors that introduce the spaces. Like I said, I am not sure if a user finds " test" exists, why they still want to keep it and update its value, rather than use the appropriate one "test". In normal case, if they update " test" and then find that there are two entries:" test" and "test", they would just delete one of them, most possibly " test".

narendly · 2020-10-30T06:32:31Z

+    Map<String, String> simpleFields = record.getSimpleFields().entrySet().stream()
+        .collect(Collectors.toMap(e -> e.getKey().trim(), e -> e.getValue().trim()));
+
+    Map<String, List<String>> listFields = record.getListFields().entrySet().stream()
+        .collect(Collectors.toMap(e -> e.getKey().trim(),
+            e -> e.getValue().stream().map(String::trim).collect(Collectors.toList())));
+
+    Map<String, Map<String, String>> mapFields = record.getMapFields().entrySet().stream()
+        .collect(Collectors.toMap(e1 -> e1.getKey().trim(),
+            e1 -> e1.getValue().entrySet().stream()
+                .collect(Collectors.toMap(e2 -> e2.getKey().trim(), e2 -> e2.getValue().trim()))));
+
+    ZNRecord trimmedRecord = new ZNRecord(record);
+    trimmedRecord.setSimpleFields(simpleFields);
+    trimmedRecord.setListFields(listFields);
+    trimmedRecord.setMapFields(mapFields);
+
+    return trimmedRecord;
+  }


Two comments:

technically spaces are allowed in names I believe? sure it's ugly, but we haven't enforced the "no-leading/trailing spaces" rule, so these are valid strings as names.

this is probably more minor, but doing trimming this way right here might not be so ideal because this might increase gc pressure / heap usage.

all in all, I am also of the opinion that we enforce this at the UI level. this is users' responsibility to make sure their input is correct after all.

more discussion is in this thread: https://github.com/apache/helix/pull/1499/files#r515354253

At rest api layer, when this toZNRecord() is called, it is usually creating/updating configs. I don't think the QPS for updating configs is high at all. So I wouldn't worry about gc/heap pressure.

I don't think that thread is answering my question. my original question is, who decided " abcname" or "abcname " were invalid? i'm just playing the devil's advocate here - it's perfectly fine for a user to create a partition named "xxx ", for example?

kaisun2000

There are many good discussions. But unless we make some actions like this first or we can fix the UI issue quickly. There can be further production issues, which would take the team a lot of effort trouble shoot.

@pkuwm, can you have a look to see if UI fix can be done quickly. I will give an approve for now.

jiajunwang · 2020-11-02T21:08:24Z

@pkuwm I feel the PR worths a discussion. Could you please schedule a quick discussion so we can bring everyone onto the same page?

huizhilu · 2020-11-03T19:39:09Z

There are many good discussions. But unless we make some actions like this first or we can fix the UI issue quickly. There can be further production issues, which would take the team a lot of effort trouble shoot.

@pkuwm, can you have a look to see if UI fix can be done quickly. I will give an approve for now.

@kaisun2000 @jiajunwang Thanks for the review! I still doubt the UI auto adds a leading space. We'll discuss more to determine a fair solution to improve it, eg: validate the the input and give warning/bad request response, or trim the spaces like this PR, etc..

Trim strings in znrecord fields for REST json content

e81279f

huizhilu force-pushed the trim branch from b9e8d6d to e81279f Compare October 30, 2020 06:07

jiajunwang reviewed Oct 30, 2020

View reviewed changes

narendly reviewed Oct 30, 2020

View reviewed changes

kaisun2000 approved these changes Nov 2, 2020

View reviewed changes

huizhilu closed this Dec 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trim strings in znrecord fields for REST json content#1499

Trim strings in znrecord fields for REST json content#1499
huizhilu wants to merge 1 commit into
apache:masterfrom
huizhilu:trim

huizhilu commented Oct 30, 2020 •

edited

Loading

Uh oh!

jiajunwang Oct 30, 2020

Uh oh!

huizhilu Oct 30, 2020

Uh oh!

jiajunwang Oct 31, 2020

Uh oh!

narendly Oct 31, 2020 •

edited

Loading

Uh oh!

huizhilu Oct 31, 2020

Uh oh!

narendly Oct 30, 2020

Uh oh!

huizhilu Oct 30, 2020

Uh oh!

narendly Oct 31, 2020

Uh oh!

kaisun2000 left a comment

Uh oh!

jiajunwang commented Nov 2, 2020

Uh oh!

huizhilu commented Nov 3, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

huizhilu commented Oct 30, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issues

Description

Tests

Documentation (Optional)

Commits

Code Quality

Uh oh!

jiajunwang Oct 30, 2020

Choose a reason for hiding this comment

Uh oh!

huizhilu Oct 30, 2020

Choose a reason for hiding this comment

Uh oh!

jiajunwang Oct 31, 2020

Choose a reason for hiding this comment

Uh oh!

narendly Oct 31, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

huizhilu Oct 31, 2020

Choose a reason for hiding this comment

Uh oh!

narendly Oct 30, 2020

Choose a reason for hiding this comment

Uh oh!

huizhilu Oct 30, 2020

Choose a reason for hiding this comment

Uh oh!

narendly Oct 31, 2020

Choose a reason for hiding this comment

Uh oh!

kaisun2000 left a comment

Choose a reason for hiding this comment

Uh oh!

jiajunwang commented Nov 2, 2020

Uh oh!

huizhilu commented Nov 3, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

huizhilu commented Oct 30, 2020 •

edited

Loading

narendly Oct 31, 2020 •

edited

Loading