toludaree/classified-ads-scraper


Classified Ads Scraper

Scrape ads from ClassifiedAds.com

Requirements

  • Python (>= 3.10)

Reproducing the environment

  • Clone the repository.
    git clone https://github.com/toludaree/classified-ads-scraper.git
  • Create a Python virtual environment named .venv and activate it. You can use the built-in venv module.
    python -m venv .venv
    
    # Activate
    .venv\Scripts\activate     # Windows
    source .venv/bin/activate  # Linux/macOS
  • Install Scrapy and its associated libraries from requirements.txt.
    pip install -r requirements.txt
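
After installing, it can help to confirm that the active environment meets the requirement above. The following is a minimal sketch, not part of the repository: the helper name check_environment is hypothetical, and it only assumes that Scrapy is installed under the package name scrapy.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # taken from the Requirements section


def check_environment(min_python=MIN_PYTHON, package="scrapy"):
    """Return a list of problems found; an empty list means the setup looks fine.

    Hypothetical helper for illustration; it is not shipped with the scraper.
    """
    problems = []

    # Compare only (major, minor) against the minimum version.
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required, found {found}")

    # Look up the installed distribution without importing it.
    try:
        metadata.version(package)
    except metadata.PackageNotFoundError:
        problems.append(f"{package} is not installed in this environment")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Environment looks ready.")
```

Run it with the virtual environment activated; if it reports a missing package, repeat the pip install step inside .venv.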

Scrape ClassifiedAds

  • Navigate into the classifiedads directory.
    cd classifiedads
  • Choose the category or subcategory you want to scrape from ClassifiedAds.com. Here is a screenshot of all the categories and subcategories.
  • Start the crawl with the scrapy crawl command.
    scrapy crawl ads -a name=<category> -O <file-path>
    
    # category - name of the category or subcategory you chose in the previous step
    # file-path - path to save the scraped results to. It can be a JSON, CSV, or XML file.
    • For example, to scrape SUV ads and save the results to suv.json:
      scrapy crawl ads -a name="SUVs" -O suv.json
    • A screenshot of the crawling session in progress.
    • A screenshot of the results. You can get the JSON file here.
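
Once the crawl finishes, the output can be inspected from Python. The sketch below assumes the results were saved as JSON (as in the suv.json example), in which case Scrapy writes a single JSON array of item objects. The function name summarize_ads is hypothetical. Because the item fields are not documented here, the code reports whatever keys it finds rather than assuming any.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_ads(path):
    """Load a Scrapy JSON export and return (item count, {field: occurrences}).

    Hypothetical helper for illustration; field names are discovered, not assumed.
    """
    # A .json feed written by `scrapy crawl ... -O file.json` is one JSON array.
    items = json.loads(Path(path).read_text(encoding="utf-8"))

    # Count how many items carry each field, to spot sparsely filled ones.
    field_counts = Counter(key for item in items for key in item)
    return len(items), dict(field_counts)
```

For example, `count, fields = summarize_ads("suv.json")` gives the number of scraped ads and how often each field appears. A field whose count is lower than the item count is missing from some ads.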
