Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fourth version #33

Merged
merged 47 commits into from Aug 23, 2022
Merged

Fourth version #33

merged 47 commits into from Aug 23, 2022

Conversation

anamika-yadav99
Copy link
Collaborator

Addresses issue #30 and issue #32

@manulera
Copy link
Owner

Hi Anamika,

I haven't had a look at the code yet, but you should invest more time in the documentation. Your readme does not address the points that I raised in our previous email exchange:

You should explain what your pipeline does so that someone that comes to the repo for the first time understands it. This should cover:

  • How/where are allele features defined.
    • What do the output files look like, with some examples, e.g. take an example genotype and explain how the output in each file will be. Not addressed
    • How to download the fluorescent protein data (this could go in the readme section Getting the data). Not addressed
  • Also, the issue mentioned that there should be:
    • Step-by-step guide on how to run the pipeline, taking the public available strains in nbrp_strains. Should be improved
    • Instructions to write your own format.py, with some example. Should be improved

You should start the documentation of the pipeline by briefly explaining:

  • What is the starting data (lists of genotypes)
  • What information we want to extract from them
  • What is the extra input files that we use and what do they contain (what types of allele features, how they are defined in the toml files)
  • What does the output of a particular example line in strains.tsv look in:
    • All the output files of build nltk_tags
    • All the output files of summary_nltg_tags

Some other things I noticed so far:

  • summary_nltg_tags should be renamed to summary_nltk_tags
  • The tests should be improved. When testing, you should use a shorter test_strains.tsv, but verify that all generated files are correct, the ones produced by build_nltk_tags and summary_ntlk tags.

@manulera manulera merged commit f35eb72 into master Aug 23, 2022
@manulera manulera deleted the fourth_version branch August 23, 2022 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants