[WIP] Mongoconverter #1

Merged
merged 49 commits into from Nov 30, 2018

Conversation

wuxiaohua1011
Contributor

Added the MongoDB converter.

wuxiaohua1011 and others added 30 commits September 17, 2018 09:55
…or Lark input.

Output parsing now works for simple queries. Still need to figure out how to handle nested (layered) queries.

TODO:
1. Work out how to parse nested queries such as a > 1 AND b < 1 AND (c < 1 OR d < 2). The current method separates each comparison into its own subquery, while the desired output is a top-level AND statement with the OR statement nested inside it.
2. Investigate whether separating the subqueries actually affects the output. If not, no change is needed.
3. Input such as nelements="Si,O" does not work. Figure out why.
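The nested shape described in TODO item 1 can be sketched with Python's standard `ast` module. This is not pql's implementation and not the code in this PR; `to_mongo` and `_convert` are hypothetical names, and the sketch only handles simple `field <op> literal` comparisons joined by `and`/`or`.

```python
import ast

# Map Python comparison node types to MongoDB query operators.
_CMP = {
    ast.Lt: "$lt", ast.LtE: "$lte",
    ast.Gt: "$gt", ast.GtE: "$gte",
    ast.Eq: "$eq", ast.NotEq: "$ne",
}


def to_mongo(expr):
    """Translate a Python-like boolean expression into a MongoDB filter dict."""
    return _convert(ast.parse(expr, mode="eval").body)


def _convert(node):
    if isinstance(node, ast.BoolOp):
        # `and` / `or` keep their nesting: each operand is converted recursively.
        op = "$and" if isinstance(node.op, ast.And) else "$or"
        return {op: [_convert(value) for value in node.values]}
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and isinstance(node.left, ast.Name)
    ):
        # literal_eval accepts an AST node and also handles negative numbers.
        value = ast.literal_eval(node.comparators[0])
        return {node.left.id: {_CMP[type(node.ops[0])]: value}}
    raise ValueError("unsupported expression: " + ast.dump(node))


query = to_mongo("a > 1 and b < 1 and (c < 1 or d < 2)")
```

Here `query` has a single top-level `$and` whose third operand is the `$or` clause, which is the layering the TODO asks for.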
…d to do is to parse the string beforehand so that the string is correctly passed into pql
does not interpret "or"
TODO:
fix the pql problem with input containing symbols, such as a < -1
1. Created a new version of the grammar --> v0.9.6.g
2. v0.9.6.g (NEW) and v0.9.5.g (OLD) differ mainly in the following:
	- NEW removes the mandatory parenthesis check that OLD performs at the atom level. Note that OLD fails on this input: filter=(a<0 or b>1)
	- NEW replaces OR and AND with the keyword CONJUNCTION at both the expression and term levels. CONJUNCTION is OR | AND, since either can appear.
	- NEW removes andcomparison, since it is now obsolete.
	- NEW incorporates parenthesis **preservation** by adding a "(" check at the term level and a ")" check at the comparison level, since only a term needs to know where a parenthesis starts and ends.

Made sure that transformer can parse the new tree

The scheme for the tree and cleanPQL is as follows:
every term is surrounded by parentheses, regardless of whether the original input has parentheses or not.

ex: filter = (a=0 and b=1) or c=2

tree generated will be:
expression
	term
		term
			term
				atom -> a=0
			and
			term
				atom -> b=1
		or
		term
			atom -> c=2

and the cleanPQL generated after parsing through the tree will be:
(((a==0) and (b==1)) or (c == 2))
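The "wrap every term in parentheses" scheme can be illustrated without Lark. The sketch below is an assumption-laden stand-in: it represents the tree as nested tuples rather than a lark.Tree, and `clean_pql` is a hypothetical helper, not the transformer in this PR.

```python
import re


def clean_pql(node):
    """Render a nested tree as a fully parenthesized Python-like expression."""
    if isinstance(node, str):
        # Atom: rewrite a lone '=' as '==', leaving <=, >=, != and == untouched.
        return "(" + re.sub(r"(?<![<>=!])=(?!=)", "==", node) + ")"
    # Term: a (left, conjunction, right) triple; recurse into both sides.
    left, conjunction, right = node
    return "({} {} {})".format(clean_pql(left), conjunction, clean_pql(right))


# filter = (a=0 and b=1) or c=2, written as nested tuples (hypothetical layout).
tree = (("a=0", "and", "b=1"), "or", "c=2")
result = clean_pql(tree)
```

Because every atom and every term adds its own pair of parentheses, the output is unambiguous for pql whether or not the user typed parentheses in the original filter.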

** Preservation here means that I do not check whether the user entered the right number of parentheses; I just record whether one is there. Example of valid input into Lark: (a < 0
…i,O,X'

todo:
write more tests and documentation
passed: test_one_input
passed: test_two_inputs_with_and
passed: test_two_inputs_with_or
passed: test_valid_numbers_positive
passed: test_multiple_entries
passed: test_mixing_upper_case_and_lower_case
passed: test_float
passed: test_scientific_number
passed: test_negative_number
testing function restructured
python main.py "filter=a<1" -v "(1,2,3)" -a "{'a':'b'}"

 Date:      Fri Nov 2 16:13:41 2018 -0700
@dwinston
Contributor

This is a work in progress (WIP) to add a lark.Transformer for the lark.Tree that results from parsing a filter according to a given optimade grammar spec. The transformer returns a Python-like format for the filter, which is then fed to the pql (Python expression to MongoDB query translator) library to yield a MongoDB query filter equivalent to the optimade filter string. Users can configure aliases from standard optimade entry fields to the field names in the database, e.g. 'nelements' may map to 'num_elements' in the host collection.
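As a rough illustration of the alias idea, the sketch below renames field names in an already-built MongoDB filter dict. Where the PR actually applies aliases (before or after pql) is not stated here, so treat this as an assumption; `apply_aliases` is a hypothetical helper name.

```python
def apply_aliases(query, aliases):
    """Return a copy of a MongoDB filter with field names renamed per `aliases`."""
    if isinstance(query, list):
        return [apply_aliases(item, aliases) for item in query]
    if isinstance(query, dict):
        renamed = {}
        for key, value in query.items():
            # Operator keys such as $and or $lt are kept; other keys are field names.
            new_key = key if key.startswith("$") else aliases.get(key, key)
            renamed[new_key] = apply_aliases(value, aliases)
        return renamed
    return query


# Assumed alias table: optimade field name -> field name in the host collection.
aliases = {"nelements": "num_elements"}
mongo_filter = {"$and": [{"nelements": {"$lt": 3}}, {"a": {"$gt": 1}}]}
aliased = apply_aliases(mongo_filter, aliases)
```

One simplification to be aware of: a value that is itself an embedded document would also have its keys renamed, which may not be what a real deployment wants.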

@wuxiaohua1011 is a Berkeley undergrad interning with the Materials Project.

@wuxiaohua1011 wuxiaohua1011 deleted the mongoconverter branch November 30, 2018 21:18
@wuxiaohua1011 wuxiaohua1011 restored the mongoconverter branch November 30, 2018 21:20
@wuxiaohua1011 wuxiaohua1011 reopened this Nov 30, 2018
@dwinston dwinston merged commit 20d52c1 into Materials-Consortia:master Nov 30, 2018