Add new synthesis recipes to API. #257

hhaoyan · 2021-04-26T13:54:56Z

Adding a new schema for synthesis recipes, as we want to incorporate the latest solid-state and sol-gel synthesis datasets from https://github.com/CederGroupHub/text-mined-synthesis_public.


hhaoyan · 2021-05-06T22:44:28Z

Hi @mkhorton could you elaborate on what tests should be implemented for this particular API? Thanks!

mkhorton · 2021-05-06T22:54:25Z

Right now, no tests (shocking, I know!). We're just building up our test suite for this repo and figuring out what the tests should look like, starting with client tests. Maybe @munrojm can comment.

In any case, this can be merged without tests if it's otherwise good.

munrojm · 2021-05-06T23:46:58Z

Hi @hhaoyan, this looks great so far! Thanks for putting it together. As @mkhorton said, I am in the process of getting a full initial suite of tests implemented, so don't worry about that portion.

Let us know when you feel it is ready to merge.

src/mp_api/routes/synthesis/models/core.py

hhaoyan · 2021-05-07T14:53:13Z

@mkhorton @munrojm I'm thinking of doing some post-processing in the Query classes based on the query parameters. However, the current method def post_process(self, docs: List[Dict]) only accepts a single parameter, docs. Is there a proper way to pass the query parameters to post_process?

munrojm · 2021-05-07T16:57:28Z

@hhaoyan, not currently. What sort of post-processing do you have in mind? Right now the idea is to have the post_process method run for both the search and key name endpoints. This would not support using values of the query parameters that are only seen in the former.

hhaoyan · 2021-05-07T17:12:16Z

I’m trying to selectively return part of a full-text paragraph based on the search keywords, which requires the post-processor to have access to those keywords. For example, if a search keyword is present in the second sentence, then only that sentence will be returned. This is somewhat similar to what Google search results look like. Is there a way I can achieve this?
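For illustration, here is a minimal sketch of the behaviour described above: keep only the sentences of a paragraph that contain a search keyword. The function name sentences_with_keywords and the naive regex sentence splitter are assumptions made for this example; this is not code from the PR.

```python
import re
from typing import List


def sentences_with_keywords(paragraph: str, keywords: List[str]) -> List[str]:
    """Return the sentences of `paragraph` that contain any keyword.

    Hypothetical helper: sentences are split naively on '.', '!' or '?'
    followed by whitespace, and matching is case-insensitive.
    """
    # Split after sentence-ending punctuation, keeping the punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    lowered = [k.lower() for k in keywords]
    return [s for s in sentences if any(k in s.lower() for k in lowered)]


paragraph = (
    "The powders were mixed. "
    "The mixture was calcined at 900 C. "
    "It was then cooled."
)
snippets = sentences_with_keywords(paragraph, ["calcined"])
```

A splitter this simple will mis-split on abbreviations and decimal numbers, so it only shows the idea of keyword-driven trimming.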

munrojm · 2021-05-07T17:30:27Z

Have you looked into using an aggregation pipeline similar to what is in query_synth_text? I don't know all of your requirements, but each "passage" that is returned should be a short bit of text from the paragraph that is queried containing the search term. See here, https://docs.atlas.mongodb.com/reference/atlas-search/highlighting/.

Let me know if this looks like it will work. If not, we can figure out something else.

mkhorton · 2021-05-07T18:52:08Z

Yes, this is how I've done the highlighting on the current synthesis data -- I just set maxNumPassages to 1 and I believe that's below the 150 limit. Whether we post-process or not, that highlighting functionality is definitely what we want to use. It's fast and we've already written the frontend to handle the output.
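As a rough sketch of the highlighting approach discussed above, an Atlas Search aggregation pipeline with highlighting could be built like this. The index name "default", the field name paragraph_string, and the function name build_highlight_pipeline are assumptions for illustration; the actual pipeline in query_synth_text may differ.

```python
from typing import Any, Dict, List


def build_highlight_pipeline(
    keywords: str,
    path: str = "paragraph_string",  # hypothetical field name
    index: str = "default",  # assumed Atlas Search index name
    max_num_passages: int = 1,
) -> List[Dict[str, Any]]:
    """Build a $search pipeline that returns highlighted passages."""
    return [
        {
            "$search": {
                "index": index,
                # Full-text match of the keywords against the paragraph field.
                "text": {"query": keywords, "path": path},
                # Ask Atlas Search for short passages containing the hits.
                "highlight": {"path": path, "maxNumPassages": max_num_passages},
            }
        },
        {
            "$project": {
                "_id": 0,
                # Highlights are exposed through the searchHighlights metadata.
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]


pipeline = build_highlight_pipeline("calcined")
```

The resulting list could be passed to a collection's aggregate call on an Atlas cluster that has a search index defined; it will not work against a plain local MongoDB instance.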

hhaoyan · 2021-05-08T07:08:37Z

Thanks for the info @mkhorton @munrojm ! I looked at what had been done in /text_search/ and I think this should be enough. However, I don't have access to a MongoDB Atlas subscription and couldn't test the search feature on my local computer. If possible, I'd like access to a test environment so I can test and convert all the data records on our side.

hhaoyan · 2021-06-03T22:01:20Z

@mkhorton I think this could be merged now!

munrojm · 2021-06-03T22:03:27Z

@hhaoyan, can you confirm that you are able to query properly when running the API locally?

hhaoyan · 2021-06-03T22:06:02Z

Just had a conversation with @codytodonnell and it seemed to work well.

munrojm · 2021-06-03T22:09:20Z

@hhaoyan, okay great. Do you mind fixing the mypy linting issues before I merge? Don't worry about the second set of tests.

hhaoyan · 2021-06-03T22:12:33Z

OK sure!

Also, the code makes 9 ensure_index calls every time a query is run. Do you think this would slow down the API? Do you have any suggestions on how to make ensure_index run just once?

The code that makes the 9 ensure_index calls is here:

https://github.com/materialsproject/api/pull/257/files#diff-03cb6bf4cb52b7cc45905fd54f7cb1e29314ae92bc32da135f78b526059d2e98R209

munrojm · 2021-06-03T22:19:19Z

I am going to merge main in with another branch that should allow me to define a separate query operator for this custom function. That should allow it to only run when in debug mode. I will take care of that change after this is merged.

For now, can you simply comment out that code block?
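Separately from the plan above (a dedicated query operator that only runs in debug mode), one generic way to avoid repeating the index calls on every query is a small guard that remembers which keys it has already handled. This is only a sketch: IndexGuard is a hypothetical class, and it assumes the store object exposes an ensure_index(key) method like the one called in this PR.

```python
import threading
from typing import Any, Iterable, Set


class IndexGuard:
    """Hypothetical helper that calls store.ensure_index once per key."""

    def __init__(self) -> None:
        self._done: Set[str] = set()
        self._lock = threading.Lock()

    def ensure_once(self, store: Any, keys: Iterable[str]) -> None:
        # The lock keeps concurrent requests from racing on the same key.
        with self._lock:
            for key in keys:
                if key not in self._done:
                    store.ensure_index(key)
                    self._done.add(key)
```

A single guard instance would be created at module or application level and called at the start of each query, so only the first request pays for the index calls.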

mkhorton · 2021-06-03T22:25:09Z

This looks great, my only comment would be to add a docstring to the data_adaptor files to clarify what they do, what data they take, etc. and for some of the functions in those files too.

hhaoyan · 2021-06-03T22:40:26Z

OK, the mypy lint was fixed and I added docstrings to data adaptors.

I also commented out the block that ensures the indexes.

munrojm · 2021-06-03T22:43:37Z

Great, thanks @hhaoyan!

mkhorton · 2021-06-03T22:48:42Z

Thanks Haoyan!


hhaoyan · 2021-06-03T22:54:03Z

Thanks @mkhorton and @munrojm !

hhaoyan added 5 commits April 26, 2021 21:51

Add new synthesis recipes schema.

81b9023

Merge branch 'main' of github.com:materialsproject/api into main

d26c8e7

[WIP] add models to synthesis recipes and implement query classes

c08626c

[WIP] add query class for synthesis-type, experimental operations, and paragraph keywords (half-completed).

d7be2e7

[WIP] add script to convert dataset from the public repo to MP database.

ab5fc70

mkhorton reviewed May 7, 2021

src/mp_api/routes/synthesis/models/core.py (outdated)

Change synthesis type and operations into enum type.

06b1e67

Add experimental conditions query class.

5ea6641

hhaoyan added 10 commits May 27, 2021 14:17

Only keep one API endpoint for all recipe calls.

c3eabff

Merge branch 'main' into main

3deed4a

Fix ellipsis function for removing heading characters.

2e987b0

Remove debugging print statement.

c46af2c

Return total number of hits.

0e88c07

Add adaptor that converts synpro collections.

0fa6986

Allow min/max value to be set as None.

e33609e

handle cases when aggregate returns zero docs

33c9490

Let mongodb return all highlights and handle char limits by ourselves

e9177ba

Use str for targets_formula/precursors_formula

b3d0d23

hhaoyan added 2 commits June 4, 2021 06:29

Fix mypy and comment ensure_index calls

456773d

Add docstrings and comments to data adaptors.

3d9eaa1

munrojm merged commit 80f6599 into materialsproject:main Jun 3, 2021
