Skip to content

API: Disallow Namespace with null byte character or null level in it#3938

Merged
rdblue merged 2 commits intoapache:masterfrom
kbendick:disallow-space-character-in-namespace-parts
Feb 14, 2022
Merged

API: Disallow Namespace with null byte character or null level in it#3938
rdblue merged 2 commits intoapache:masterfrom
kbendick:disallow-space-character-in-namespace-parts

Conversation

@kbendick
Copy link
Contributor

@kbendick kbendick commented Jan 20, 2022

In the REST catalog, we have decided to use a null byte character to delimit certain portions of a Namespace.

To prepare for that, we should disallow any level in a namespace which contains a null byte character (technically either the deprecated \0 or the preferred unicode character \u0000).

It also doesn't make sense for a level to be null, so I've added a check for that as well.

I added three new tests and I also tested the regular expression against a large number of patterns in a Scala REPL.

@github-actions github-actions bot added the API label Jan 20, 2022
@kbendick kbendick changed the title Core: Disallow Namespace with null level or level with NULL BYTE or o… API: Disallow Namespace with null byte character or null level in it Jan 20, 2022
@kbendick
Copy link
Contributor Author

kbendick commented Jan 20, 2022

cc @rdblue re checking for null bytes when instantiating Namespaces.

@kbendick kbendick changed the title API: Disallow Namespace with null byte character or null level in it [WIP] API: Disallow Namespace with null byte character or null level in it Jan 20, 2022
@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch from c5daf86 to a6375ed Compare January 28, 2022 22:14
@kbendick kbendick changed the title [WIP] API: Disallow Namespace with null byte character or null level in it API: Disallow Namespace with null byte character or null level in it Jan 28, 2022
@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch 3 times, most recently from 28375b3 to eb6a6ab Compare January 30, 2022 03:57
Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the test is fixed

@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch 2 times, most recently from f3a117d to 1e49e8e Compare February 7, 2022 19:27
@kbendick
Copy link
Contributor Author

kbendick commented Feb 8, 2022

cc @rdblue if you could possibly merge this now that the tests are passing (I just rebased off latest master but they should still be passing).

@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch from 1e49e8e to 5e32b53 Compare February 8, 2022 18:52
private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
private static final Joiner DOT = Joiner.on('.');
private static final Predicate<String> CONTAINS_NULL_BYTE =
Pattern.compile("\0|\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between \0 and \u0000?

Copy link
Contributor Author

@kbendick kbendick Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One is unicode and one isn't. I do notice that some of our systems complain when using \0 and not the full unicode \u0000 which is the preferred one.

To be safe, I just included both. Let me test and remove the ASCII one if it's not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using just \0000 is sufficient, so I removed the ASCII one.

@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch from 5e32b53 to 8eb4379 Compare February 14, 2022 19:52
@kbendick kbendick force-pushed the disallow-space-character-in-namespace-parts branch from 8eb4379 to e47b7f5 Compare February 14, 2022 19:54
public class Namespace {
private static final Namespace EMPTY_NAMESPACE = new Namespace(new String[] {});
private static final Joiner DOT = Joiner.on('.');
private static final Predicate<String> CONTAINS_NULL_BYTE =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BYTE probably isn't correct since this is checking for the null unicode codepoint. Maybe just CONTAINS_NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Maybe just CONTAINS_NULL_CHARACTER? CONTAINS_NULL makes it sound like we're checking for null itself, which this regex does not do.

@rdblue rdblue merged commit acf2681 into apache:master Feb 14, 2022
@kbendick kbendick deleted the disallow-space-character-in-namespace-parts branch February 14, 2022 23:10
@rdblue
Copy link
Contributor

rdblue commented Feb 14, 2022

Thanks, @kbendick!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants