Shashankti/ISB_Data_Scrapping

ISB_Data_Scrapping

  • This repo contains the scripts used to scrape data from the CDSCO website about the quality of the drugs tested.
  • The raw data can be obtained from the OneDrive folder.
  • Later analysis will be added here.

List of files

  • Scraping.R - R script for scraping the CDSCO website
  • PDF_Scraper.R - R script for scraping the tables from PDF files.
  • rename.R - R script for bulk renaming of files
  • Merge_Data.R - R script to merge data from all extracted tables and clean the data
  • Splitpdf,extratctable/py - Python scripts used to extract tables from the PDFs
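
The merge-and-clean step listed above can be illustrated with a minimal Python sketch. It is not the code in Merge_Data.R: the function name `merge_tables`, the column names, and the cleaning rules (collapse whitespace, lower-case headers, drop empty rows) are assumptions made for illustration only.

```python
import csv
import io


def merge_tables(tables):
    """Merge several CSV texts whose headers may differ.

    Hypothetical helper: returns (columns, rows), where rows is a list of
    dicts that all share the same keys, in first-seen column order.
    """
    columns = []
    merged = []
    for text in tables:
        reader = csv.DictReader(io.StringIO(text))
        for row in reader:
            clean = {}
            for key, value in row.items():
                if key is None:
                    # Cells beyond the header width; ignored in this sketch.
                    continue
                # Normalise header and cell whitespace; lower-case headers
                # so "Drug Name" and "drug name" land in the same column.
                name = " ".join(key.split()).lower()
                clean[name] = " ".join((value or "").split())
            if not any(clean.values()):
                continue  # drop rows that are empty after cleaning
            for name in clean:
                if name not in columns:
                    columns.append(name)
            merged.append(clean)
    # Fill columns that a given table did not have with empty strings.
    rows = [{c: r.get(c, "") for c in columns} for r in merged]
    return columns, rows


if __name__ == "__main__":
    # Made-up sample tables; real extracted tables will look different.
    table_a = "Drug Name, Batch No\nParacetamol , B1\n"
    table_b = "drug name,manufacturer\nIbuprofen,Acme\n,\n"
    cols, rows = merge_tables([table_a, table_b])
    print(cols)
    for r in rows:
        print(r)
```

Keeping the union of all columns, rather than only the shared ones, avoids silently losing fields when the extracted tables do not all have the same layout.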
