Skip to content

Jerome3590/TextMining

Repository files navigation

TextMining

Text Mining Mind Map

Step 1. Get the data. - PyCharm w bipython package/pubmed API https://github.com/Jerome3590/TextMining/blob/master/BoneMarrow_medpub.py

Step 2. Explore the data - Elasticsearch with Kibana - Identify target fields of interest - Elasticsearch query with _sourcefilter - curl command to output file https://github.com/Jerome3590/TextMining/blob/master/MedPub_script.bash.sh https://github.com/Jerome3590/TextMining/blob/master/marrow.json

Step 3. Format the data for processing - Credit to PyRVA's Chris May for helping me put this script together
https://github.com/Jerome3590/TextMining/blob/master/JSON_Processing.py (for Elasticsearch output) https://github.com/Jerome3590/TextMining/blob/master/JSON_Processing3.py (for pubmed API output) https://github.com/Jerome3590/TextMining/blob/master/outfile.json https://github.com/Jerome3590/TextMining/blob/master/pubMed.csv (CSV version)

Step 4. Choose NLP Model based on Requirements - Additional file processing depending on NLP model selection

Step 5. Perform NLP and Analyze Results with Databricks

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published