Skip to content

An HTML parser for "KEGG Mapper – Reconstruct Result" pages

Notifications You must be signed in to change notification settings

JulieMarieCharmillon/kegg_parser

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 

Repository files navigation

kegg_parser

An HTML parser command line app for "KEGG Mapper – Reconstruct Result" pages.

Setup

Built with python3.9

kegg_parser.py uses five modules: beautifulsoup4*, pathlib, json, csv and argparse. Be sure to have all of them up-to-date on your environment.

txt_ko_list.py uses pandas*, pathlib and argparse.

* not built-in modules


On your terminal use the following command line:

python txt_ko_list.py "initial_tsv_file.tsv"

Returns a txt file with the KO list you want to use in the kegg Reconstruct tool (https://www.genome.jp/kegg/mapper/reconstruct.html)

Output file will be find in the folder containing your tsv file under the name "initial_tsv_file_name_KO_LIST.txt"


On your terminal use the following command line:

python kegg_parser "KEGG Mapper – Reconstruct Result.html" modules

Returns a csv file with the list of all the KO from the KEGG Mapper – Reconstruct Result modules pages (under the name: "KEGG_MODULES_Grouped_KO.csv") and a json file with the corresponding dictionary (under the name: "KEGG_MODULES_dictionary.json").

It works with all "views" : "complete only", "including 1 block missing" and "including any incomplete". But please download the "views" pages separately.

Output files will be found in a "result" subfolder within the folder containing the initial "KEGG Mapper – Reconstruct Result.html" file.


On your terminal use the following command line:

python kegg_parser "KEGG Mapper – Reconstruct Result.html" pathways

Returns a csv file with the list of all the KO from the KEGG Mapper – Reconstruct Result pathways pages (under the name: "KEGG_PATHWAYS_Grouped_KO.csv") and a json file with the corresponding dictionary (under the name: "KEGG_PATHWAYS_dictionary.json").

Output files will be found in a "result" subfolder within the folder containing the initial "KEGG Mapper – Reconstruct Result.html" file.

Releases

No releases published

Packages

No packages published

Languages