webscraping

Environment Setup

Install the latest Python 3.
Install the requests dependency: pip install requests
Install Beautiful Soup 4 (needed only to run the old script): pip install beautifulsoup4

Run the script

For example: python script.py -l 100 -d 'acne vulgaris' 'Rosacea'

On a Mac with both Python 2 and Python 3 installed, run the script with python3 instead of python.

Required argument -d: the dx_label values (diagnosis results) you want to scrape. Accepts multiple inputs.

Required argument -l: the number of images required for each dx_label.
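
The two arguments above could be parsed roughly as follows. This is a minimal sketch using argparse, not the actual code in script.py: the function name build_parser and the attribute names limit and dx_labels are hypothetical, chosen only for illustration.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the CLI described above;
    # the real script.py may define its arguments differently.
    parser = argparse.ArgumentParser(
        description="Download images for one or more diagnosis labels."
    )
    parser.add_argument(
        "-l", dest="limit", type=int, required=True,
        help="number of images required for each dx_label",
    )
    parser.add_argument(
        "-d", dest="dx_labels", nargs="+", required=True,
        help="one or more dx_label values (quote labels containing spaces)",
    )
    return parser


# Equivalent to: python script.py -l 100 -d 'acne vulgaris' 'Rosacea'
args = build_parser().parse_args(["-l", "100", "-d", "acne vulgaris", "Rosacea"])
print(args.limit, args.dx_labels)  # prints: 100 ['acne vulgaris', 'Rosacea']
```

Using nargs="+" is what lets -d take several labels in one invocation, which is why labels with spaces need quoting on the command line.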

Note for scripts

First approach: use requests and Beautiful Soup 4. It works only while the search feature on the website is working. See the old script (script__old.py).
Second idea: Selenium. It can handle page navigation, but it is too slow, and it is hard to work out the navigation tabs and button clicks.
Final approach: go low level and find out how those buttons work on the back end.

https://www.dermquest.com/Services/facetData.ashx // fetch metadata
https://www.dermquest.com/Services/imageData.ashx?diagnosis=109491&page=1&perPage=128 // query for a specific diagnosis
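
As a rough illustration of the low-level approach, the query URL above can be built from its three parameters and fetched with requests. This is a sketch under assumptions, not the code in script.py: the helper names image_query_url, pages_needed, and fetch_page are hypothetical, and the response format is not documented here, so the body is returned as raw text.

```python
import math
from urllib.parse import urlencode

BASE_URL = "https://www.dermquest.com/Services/imageData.ashx"


def image_query_url(diagnosis_id, page=1, per_page=128):
    # Parameter names (diagnosis, page, perPage) are taken from the URL above.
    query = urlencode({"diagnosis": diagnosis_id, "page": page, "perPage": per_page})
    return f"{BASE_URL}?{query}"


def pages_needed(limit, per_page=128):
    # Number of pages to request to collect `limit` images,
    # assuming every page returns up to `per_page` results.
    return math.ceil(limit / per_page)


def fetch_page(diagnosis_id, page=1, per_page=128):
    # requests is imported here so the URL helpers work without it installed.
    import requests

    response = requests.get(image_query_url(diagnosis_id, page, per_page), timeout=30)
    response.raise_for_status()
    # The response format is an unknown here, so return the raw body.
    return response.text


print(image_query_url(109491))
print(pages_needed(100))
```

With -l 100 and 128 results per page, a single request per diagnosis would be enough; larger limits would loop over page=1..pages_needed(limit).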
