Outreachy Code Project: Name: Anamika Yadav #92

Closed
26 tasks done
anamika-yadav99 opened this issue Apr 3, 2022 · 19 comments

Comments

@anamika-yadav99
Contributor

anamika-yadav99 commented Apr 3, 2022

Applicant: https://github.com/anamika-yadav99

Welcome to the Ersilia Open Source Initiative. This issue will serve to track all your contributions for the project “Improve the Ersilia Model Hub, a FOSS platform offering pre-trained AI/ML models for research”.

Please tick the tasks as you complete them. To make a final application it is not required to have completed all tasks. This project requires knowledge of the Python programming language. The tasks are not ordered from more to less important, they are simply related to different skills. Start where you feel most comfortable.


Initial steps

  • Record your application for the project in the Outreachy website referencing this issue. Please make sure to select the right project on the website.
  • Join the Slack channel to follow public communications.
  • Comment under this issue explaining why you are interested in this project.

Installation of the Ersilia Model Hub

  • Install the ersilia library.
  • Add a screenshot under this issue showing that you are able to run one model (for example, the chemprop-antibiotic model)
  • Fetch at least 3 models from the Ersilia Model Hub. You can find these models with the ersilia catalog command. Add a screenshot of the local catalog (ersilia catalog --local)

CLI

  • Check if there are open issues related to the command line interface. Continue with the next tasks if they are open.
  • Select one issue related to improving the CLI and request to be assigned to it.
  • Link the #PR as a comment under this issue.
  • Make any changes required in the PR and tick this box once it has been approved.
  • Suggest at least one missing feature in the CLI (one sentence is enough, for example: “Add command to estimate memory usage of a particular model”).

Python library

  • Add a screenshot showing that you are able to run predictions using ersilia as a Python library (find more information here). Ideally, use a Jupyter notebook.
  • Create a simple Streamlit app using the ersilia Python library. The app can have an input and an output box, and perhaps a few models to select. Add a screenshot of the app as seen in your browser.
  • Write a docstring for the ErsiliaModel class. Use the Google Python Style guide. Paste the docstring as a comment below (do not use a PR).

Scientific content

  • Check the models available in the Hub
  • Select one model from the list and write a technical card for it (what the model is for, what input it takes, which data was used to create it, what kind of ML algorithm it uses…)
  • Add your card as a comment to this issue
  • Search the scientific literature and suggest 3 new models (comment in this issue) that would be relevant to incorporate in the Hub.

Other

If you have interest in working on related topics, or have new suggestions, please do the following

  • Add a comment in this issue with your new idea, tagging the mentor
  • Get feedback from the mentor and act accordingly
  • Link in the comments any other PR you have contributed to.

Community

  • Look up two other projects and comment on their issues with feedback on one of their tasks
  • If you have feedback from your peers, answer it in this issue.

Final application

  • I have answered all comments from mentors and contributors
  • All PR or issues assigned to me are complete
  • I have submitted my final application to the project
@camus60

camus60 commented Apr 3, 2022

@anamika-yadav99 Is this the format that the issue is meant to have going forward?

@GemmaTuron
Member

Hi @anamika-yadav99
You have done some good work on the CLI! Can you link the relevant issues and PRs that you have contributed to? This way we can better follow your work!
Thanks

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 8, 2022

@GemmaTuron I have completed most of these tasks. Can I do it after a couple of days? I have my exams going on currently.

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 10, 2022

I have fetched and run the chemprop-antibiotic model
image

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 10, 2022

Fetched 3 models
image

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 10, 2022

Select one issue related to improving the CLI and request to be assigned to it.

I was assigned issue #13, which adds model search functionality to the CLI: ersilia catalog --text "chemprop-antibiotic" and ersilia catalog --mode 'pretrained'.

image

I have completed the task and the issue was closed.
Pull Request successfully merged : #41

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 10, 2022

Suggest at least one missing feature in the CLI

  1. In the discussion on the issue "Improving the search engine in https://airtable.com/shrUcrUnd7jB9ChZV/tblZGe2a2XeBxrEHP" #60, I suggested adding tag words to each model card so that models can be searched by their tag keywords. Later, these keywords could also be added to ersilia --help.
    See #60 (comment) by @miquelduranfrigola approving the idea.
  2. Improve the model search functionality so that it displays closely related models when the search term in the model catalog table contains a typo or misspelling. Feature Request: Improve model search functionality #241

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 10, 2022

Link in the comments any other PR you have contributed to.

  1. Issue: Make ersilia conda-installable #9
    I added a conda recipe that is used to build the conda package, then uploaded my built package to Anaconda to test it.
  • To build the conda recipe for Ersilia, I followed the recipe TensorFlow used a few years ago. The recipe downloads the .whl file of Ersilia and then builds it in conda using the conda build command.

I raised a PR for this.
PR #89 successfully merged

2. Better model catalog display #12

  • I improved the model display according to the discussion in the comment Better model catalog display #12 (comment)

  • I added a class that slices the catalog table into tables of 15 entries.

  • I also added ersilia catalog --next and ersilia catalog --previous to the CLI, which display the next and previous table respectively.

  • I did this by storing a counter in an auxiliary file. Every time the command ersilia catalog is called, the counter is set to 1 and table 0 is displayed. The counter is used as the index of the next table: ersilia catalog --next displays the next table and increases the counter by 1. If the user enters the command after the last table has been shown, the counter is frozen at the index of the last table and the user is prompted to go back. ersilia catalog --previous works the same way in reverse.
    I raised a PR.
    PR Enhanced Model catalog display #238
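
The paging idea described above can be sketched roughly as follows. This is not the code from the PR: the class name CatalogPager, the JSON state file, and the method names are all hypothetical, and for simplicity the sketch stores the index of the table currently shown rather than the index of the next one.

```python
import json
from pathlib import Path

PAGE_SIZE = 15  # entries per table, as described in the comment above


class CatalogPager:
    """Hypothetical helper: slices rows into pages and keeps the page index in a file."""

    def __init__(self, rows, state_path, page_size=PAGE_SIZE):
        self.rows = list(rows)
        self.state_path = Path(state_path)
        self.page_size = page_size

    @property
    def last_index(self):
        # Index of the last page; 0 when there are no rows at all.
        return max(0, (len(self.rows) - 1) // self.page_size)

    def _page(self, index):
        start = index * self.page_size
        return self.rows[start:start + self.page_size]

    def _load(self):
        # The state file plays the role of the auxiliary counter file.
        if self.state_path.exists():
            return json.loads(self.state_path.read_text())["current"]
        return 0

    def _save(self, index):
        self.state_path.write_text(json.dumps({"current": index}))

    def first(self):
        """Plain catalog call: reset the counter and show table 0."""
        self._save(0)
        return self._page(0)

    def next(self):
        """Show the next table, or None when already at the last one."""
        index = self._load() + 1
        if index > self.last_index:
            return None  # counter stays frozen; the caller prompts the user
        self._save(index)
        return self._page(index)

    def previous(self):
        """Show the previous table, or None when already at the first one."""
        index = self._load() - 1
        if index < 0:
            return None
        self._save(index)
        return self._page(index)
```

Because the counter lives in a file, each CLI invocation can be a separate process and still know which table was shown last. Returning None at either end leaves the stored index untouched, which mirrors the "frozen counter" behaviour described above.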

3. Improved ModelSearcher functionality as discussed in issue #241

  • I added two methods to the ModelSearcher class to search using --text and --mode. For --text, the search_text method also performs a fuzzy match of the input against the data in the catalog table. The fuzzy match is performed on model_id, description and slug.

Raised a PR #262
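
A fuzzy text search over those three fields could look something like the sketch below. It is only an illustration: the thread does not say which matching library the PR uses, so the standard-library difflib is swapped in here, and the function signature, the 0.8 threshold, and the catalog entries in the usage note are assumptions.

```python
from difflib import SequenceMatcher

# Fields named in the comment above.
FIELDS = ("model_id", "slug", "description")


def similarity(query, text):
    """Best match ratio between the query and the text or any single word in it."""
    query = query.lower()
    text = text.lower()
    candidates = [text] + text.split()
    return max(SequenceMatcher(None, query, c).ratio() for c in candidates)


def search_text(catalog, query, threshold=0.8):
    """Return entries whose fields approximately match the query, best match first.

    The threshold is an assumed cut-off, not a value taken from the PR.
    """
    scored = []
    for entry in catalog:
        score = max(similarity(query, str(entry.get(f, ""))) for f in FIELDS)
        if score >= threshold:
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]
```

With a catalog such as `[{"model_id": "example-1", "slug": "chemprop-antibiotic", "description": "Antibiotic activity prediction"}]` (a made-up entry), a misspelled query like `"chemprop-antibotic"` still scores above the threshold against the slug, so the entry is returned despite the typo.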

@miquelduranfrigola
Member

Hi @anamika-yadav99 thanks for documenting all of this so nicely

@anamika-yadav99
Contributor Author

Add a screenshot showing that you are able to run predictions using ersilia as a Python library (find more information here). Ideally, use a Jupyter notebook.
image

@Amna-28
Contributor

Amna-28 commented Apr 16, 2022

Link in the comments any other PR you have contributed to.

  1. issue Make ersilia conda-installable #9
    Added conda recipe which is used to build conda package . Then uploaded a package from my end to test the package.
    Raised PR for the same.
    PR added conda recipe  #89 successfully merged.
  2. Currently working on Better model catalog display #12 Implementing the suggestions mentioned in the comment
    Better model catalog display #12 (comment)

Hi @anamika-yadav99, you have done a great job overall. I really like your approach to solving issue #13 (add model search functionality in CLI). Well done!

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 17, 2022


Thanks @Amna-28 . You did an amazing job with the streamlit app yourself. The app looks great!

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 17, 2022

Search the scientific literature and suggest 3 new models (comment in this issue) that would be relevant to incorporate in the Hub.

DeepDTA: deep drug–target binding affinity prediction:
https://academic.oup.com/bioinformatics/article/34/17/i821/5093245

AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery:
https://arxiv.org/abs/1510.02855

ChemBO: Bayesian Optimization of Small Organic Molecules with Synthesizable Recommendations:
https://arxiv.org/abs/1908.01425

DEEPScreen: high performance drug–target interaction prediction with convolutional neural networks using 2-D structural compound representations
https://pubs.rsc.org/en/content/articlelanding/2020/sc/c9sc03414e

anamika-yadav99 changed the title from "Outreachy Code Project: Name: Anamika Yadav ; Github : https://github.com/anamika-yadav99" to "Outreachy Code Project: Name: Anamika Yadav" on Apr 17, 2022
@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 17, 2022

Write a docstring for the ErsiliaModel class. Use the Google Python Style guide.

image

@GemmaTuron Will this be enough, or should I also write docstrings for the methods in the ErsiliaModel class?
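
For readers who cannot see the screenshot, the general shape of a Google-style docstring for a class and its methods is sketched below. The class ExampleModel, its arguments, and its placeholder predict method are hypothetical and are not the ErsiliaModel API; only the section layout (Attributes, Args, Returns) follows the style guide.

```python
class ExampleModel:
    """Wraps a pre-trained model behind a small prediction interface.

    This class is a hypothetical stand-in used only to show the docstring
    layout; it does not load or run a real model.

    Attributes:
        model_id: Identifier of the model to load.
        verbose: Whether to print progress messages.
    """

    def __init__(self, model_id, verbose=False):
        """Initializes the wrapper.

        Args:
            model_id: Identifier of the model to load.
            verbose: If True, print progress messages.
        """
        self.model_id = model_id
        self.verbose = verbose

    def predict(self, inputs):
        """Returns one placeholder score per input.

        Args:
            inputs: A list of input strings, for example SMILES.

        Returns:
            A list containing the length of each input string, standing in
            for real model output.
        """
        return [len(item) for item in inputs]
```

Method docstrings use the same sections as the class docstring, so documenting the methods as well keeps the output of help() consistent across the class.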

@dauinh
Contributor

dauinh commented Apr 17, 2022

Hi @anamika-yadav99, I think that your CLI feature suggestion is a really nice idea

@Amna-28
Contributor

Amna-28 commented Apr 17, 2022


Thank you so much @anamika-yadav99

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 20, 2022

Create a simple Streamlit app using the ersilia Python library. The app can have an input and an output box, and perhaps a few models to select. Add a screenshot of the app as seen in your browser.
image

image

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 21, 2022

Add your card as a comment to this issue
image

@anamika-yadav99
Contributor Author

Comment under this issue explaining why you are interested in this project.

I’m a 3rd-year undergraduate student from New Delhi, India, currently pursuing engineering at GGS Indraprastha University, New Delhi. I’m a big supporter of open science and open source, and I’m intrigued by the support and opportunities open source can provide to the scientific community and to people in general. I have always wanted to work on projects that build tools for the scientific research community, which is one of my main motivations for pursuing this project. Moreover, this project is aimed at developing countries.
I'm passionate about the application of ML in healthcare, drug discovery, genetic engineering, etc. I wish to pursue postgraduate studies in a field such as computational biology, biotechnology, or AI + health (not sure yet, still exploring), or something along those lines. This project gives me an opportunity to work closely with ML papers in drug discovery.
I'm fluent in Python and very familiar with implementing ML papers and ML toolboxes. This project is a good fit for my skill set and interests.

gitbook-com bot pushed a commit that referenced this issue Jul 10, 2022

6 participants