cannabis-strains

Scrapes online reviews of cannabis strains and performs exploratory data analysis on them, including data cleaning, data transformation, and unsupervised association rule learning to mine associations between strain flavours and their medicinal properties.

The scraping uses a combination of Selenium and BeautifulSoup + requests. This is followed by data manipulation and visualization with pandas and Plotly, and association rule learning with Apriori.
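To make the association rule step more concrete, here is a minimal pure-Python sketch of the Apriori idea. It is not the code used in this repository: the `flavour:`/`effect:` tags, the toy transactions, and the function names `support`, `apriori`, and `rules` are all made up for illustration, and the thresholds are arbitrary.

```python
from itertools import combinations

# Hypothetical toy data: one set of flavour and effect tags per strain.
transactions = [
    {"flavour:citrus", "flavour:sweet", "effect:stress"},
    {"flavour:citrus", "effect:stress"},
    {"flavour:earthy", "effect:pain"},
    {"flavour:citrus", "flavour:earthy", "effect:stress", "effect:pain"},
    {"flavour:sweet", "effect:insomnia"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {frozenset: support} for every frequent itemset."""
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so this lookup is safe.
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, consequent, supp, confidence))
    return found


frequent = apriori(transactions, min_support=0.4)
found = rules(frequent, min_confidence=0.8)

# Keep only rules of the form flavour(s) -> effect(s).
flavour_to_effect = [
    rule for rule in found
    if all(i.startswith("flavour:") for i in rule[0])
    and all(i.startswith("effect:") for i in rule[1])
]
for antecedent, consequent, supp, confidence in flavour_to_effect:
    print(sorted(antecedent), "->", sorted(consequent), supp, confidence)
```

Filtering the rules down to flavour antecedents and effect consequents mirrors the question asked here, namely which flavours tend to co-occur with which medicinal properties. On real data the thresholds would need tuning.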

To reproduce:

Clone the repository or download as ZIP and unzip it somewhere.
Run link_crawl.py using Python 3. You'll need the Selenium Python bindings as well as a version of ChromeDriver that matches your installed Chrome browser (usually the latest).
Then run strain_scrape_standard.py to produce a CSV file with all the strain data nicely organized into columns and labelled.
Finally, open cannabis_strains_EDA.ipynb in Jupyter Notebook, where you can run individual cells or the entire notebook, or save the output to HTML. An HTML version of the notebook is available on my site.
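The steps above could also be chained from a small driver script. This is only a sketch under some assumptions: that both scripts run without command-line arguments from the repository root, that `jupyter` with nbconvert is on your PATH, and that ChromeDriver is already set up. The helper names `build_commands` and `run_pipeline` are hypothetical and not part of this repository.

```python
import subprocess
import sys


def build_commands(python=sys.executable):
    """Return the three reproduction steps as argument lists, in order."""
    return [
        # Step 1: crawl the strain links (needs Selenium and ChromeDriver).
        [python, "link_crawl.py"],
        # Step 2: scrape each strain page and write the CSV file.
        [python, "strain_scrape_standard.py"],
        # Step 3: execute the notebook and save the result as HTML.
        ["jupyter", "nbconvert", "--to", "html", "--execute",
         "cannabis_strains_EDA.ipynb"],
    ]


def run_pipeline(repo_dir="."):
    """Run every step from repo_dir, stopping at the first failure."""
    for command in build_commands():
        # check=True raises CalledProcessError if a step exits non-zero.
        subprocess.run(command, cwd=repo_dir, check=True)


# Dry run: show what would be executed without running anything.
for command in build_commands():
    print(" ".join(command))
```

Calling `run_pipeline()` from the repository root would execute the steps one after another. Running them by hand as described above is equivalent and makes it easier to inspect the intermediate output.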

Repository contents:

Links
Results
LICENSE
README.md
cannabis_strains_EDA.ipynb
flavour_scrape_test.py
link_crawl.py
strain_attrs.py
strain_scrape_standard.py

MihaiChelaru/cannabis-strains
