Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add javascript and regex parsers #36

Merged
merged 1 commit into from
Nov 9, 2015
Merged

Add javascript and regex parsers #36

merged 1 commit into from
Nov 9, 2015

Conversation

fjy
Copy link
Contributor

@fjy fjy commented Nov 7, 2015

No description provided.

* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use newer license header

parsed
);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you either add another UT or update this so that field names are explicitly specified and resulting map has something else and not "column_1"?

@fjy
Copy link
Contributor Author

fjy commented Nov 8, 2015

@himanshug @gianm Addressed all comments

@fjy fjy force-pushed the master branch 3 times, most recently from fe706a4 to 8d6407c Compare November 8, 2015 18:08
try {
final Object compiled = fn.apply(input);
if (!(compiled instanceof Map)) {
throw new ParseException("JavaScript parsed value must in {key: value} format!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"must be in"

@gianm
Copy link
Contributor

gianm commented Nov 8, 2015

👍

return Utils.zipMapPartial(fieldNames, Iterables.transform(values, valueFunction));
}
catch (Exception e) {
throw new ParseException(e, "Unable to parse row [%s]", input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not a big issue, but if matching fails then user would get nested ParseException, one from line 92 then that will get wrapped inside another here, should we catch specific exceptions instead of Exception ? or may be add another catch for ParseException that simple throws that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the two levels of parseexceptions will tell you exactly what is wrong and where it is wrong. I tested this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@himanshug
Copy link
Contributor

👍

final ScriptableObject scope = context.initStandardObjects();

final org.mozilla.javascript.Function fn = context.compileFunction(scope, function, "fn", 1, null);
Context.exit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to use ThreadLocal variables, which is interesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what we do for all the javascript aggregators, filters, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move the context.exit to finally, to make sure any error does not leaves the thread local set ?

@xvrl
Copy link
Contributor

xvrl commented Nov 9, 2015

@fjy is there a reason those interfaces cannot be implemented inside of the druid repository? I'd rather not add a dependency to rhino just for that, if possible.

@fjy
Copy link
Contributor Author

fjy commented Nov 9, 2015

@xvrl we can submit another PR to move all these interfaces out of java-util

@drcrallen
Copy link
Contributor

Can you add unit tests for failure cases?

@drcrallen
Copy link
Contributor

What happens if the regex has an optional group?

@fjy
Copy link
Contributor Author

fjy commented Nov 9, 2015

@drcrallen what kind of failure cases?

@drcrallen
Copy link
Contributor

@fjy failure for the matcher in the regex, failure for functions returning non-map things for the javascript come to mind immediately.

@fjy
Copy link
Contributor Author

fjy commented Nov 9, 2015

@drcrallen can we do this in a later PR? file the issue for me


public RegexParser(
final String pattern,
final Optional<String> listDelimiter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about making the delimiter a regex as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitter.on(...) supports passing a pattern, that would make it a lot more powerful

}

final Object res = fn.call(cx, scope, scope, new Object[]{input});
return res != null ? Context.toObject(res, scope) : null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use Context.jsToJava(res, Map.class) to avoid having to cast things later on I think

xvrl added a commit that referenced this pull request Nov 9, 2015
Add javascript and regex parsers
@xvrl xvrl merged commit 0c23c6e into metamx:master Nov 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants