This repository contains Python scripts to scrape property listings from multiple Malawian real estate websites and save the data to CSV files. It also includes a Jupyter notebook for data analysis and visualization.
atsogo_scraper.py: Scrapes property listings from the Atsogo website (https://atsogo.mw/listings/properties).
malawi_property_scraper.py: Scrapes property listings from multiple Malawian property websites:
- Atsogo (https://atsogo.mw)
- SGW Auctioneers and Estate Agents (https://sgw.mw)
- Nyumba24 (https://www.nyumba24.com)
- Knight Frank (https://www.knightfrank.mw) - Basic support
- Reynolds (https://reynolds.mw) - Basic support
- 4321 Property (https://www.4321property.com/malawi) - Basic support
- Scrapes property details including title, type, location, price, area, bedrooms, bathrooms, and posting date
- Handles pagination to scrape multiple pages
- Includes error handling and logging
- Respectful scraping with delays between requests
- Saves data to CSV format
- Multi-site support with unified data format
- Jupyter notebook for data analysis and visualization
- Make sure you have Python 3.6+ installed
- Install the required dependencies:

    pip install -r requirements.txt

Run the Atsogo scraper to collect all available properties:

    python atsogo_scraper.py

Run the multi-site scraper to collect properties from all supported websites:

    python malawi_property_scraper.py

Open the Jupyter notebook LilongwePropertyAnalysis.ipynb to explore and visualize the property data. The notebook includes:
- Importing necessary packages (pandas, numpy, matplotlib, seaborn)
- Loading the CSV data
- Splitting the location column into 'city' and 'area'
- Extracting the month name from the date column
- Dropping the original location column
- Exploratory data analysis with summary tables and visualizations
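The column transformations listed above could look roughly like the sketch below. It is not the notebook's actual code: the two sample rows, the ISO date format, and the handling of locations without a comma are assumptions made for illustration.

```python
import pandas as pd

# Two made-up rows in the scraper's CSV layout. The notebook would load the
# real file instead, e.g. pd.read_csv("atsogo_properties.csv").
df = pd.DataFrame(
    {
        "title": ["Plot in Area 41", "Mandala House"],
        "location": ["LILONGWE, Area 41", "Blantyre"],
        "date_posted": ["2024-03-15", "2024-07-02"],  # assumed date format
    }
)

# Split "CITY, Area" on the first comma only; a location without a comma
# keeps its city and gets a missing area.
parts = df["location"].str.split(",", n=1, expand=True)
df["city"] = parts[0].str.strip()
df["area"] = parts[1].str.strip() if parts.shape[1] > 1 else pd.NA

# Month name from the posting date; unparseable dates become missing values.
df["month"] = pd.to_datetime(df["date_posted"], errors="coerce").dt.month_name()

# The original column is no longer needed once city and area exist.
df = df.drop(columns=["location"])
```

Splitting on the first comma only keeps area names that themselves contain commas in one piece.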
Creates atsogo_properties.csv with the following columns:
- title: Property title/name
- property_type: Type of property (Plot, Complete House, Land, etc.)
- transaction_type: For Sale or For Rent
- location: Property location (city and area)
- price: Property price in Malawian Kwacha (MK)
- area_sqm: Area in square meters
- bedrooms: Number of bedrooms
- bathrooms: Number of bathrooms
- date_posted: Date when the property was posted
- description: Property description (if available)
- city: Extracted city from location (added in notebook)
- area: Extracted area from location (added in notebook)
- month: Month name extracted from date_posted (added in notebook)
Creates malawi_properties.csv with additional columns:
- source: Website source (atsogo, sgw, nyumba24, etc.)
- url: Source URL
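As an illustration of the unified format, the sketch below maps per-site records onto one fixed set of columns and writes them with the standard csv module. The helper names (to_unified_row, write_rows), the column order, and the sample records are hypothetical; the scrapers may build their rows differently.

```python
import csv
import io

# Column names as listed in this README; the order is an assumption.
FIELDNAMES = [
    "title", "property_type", "transaction_type", "location", "price",
    "area_sqm", "bedrooms", "bathrooms", "date_posted", "description",
    "source", "url",
]


def to_unified_row(record, source, url):
    """Return a dict with every unified column, blank where a site gave nothing."""
    row = {name: record.get(name, "") for name in FIELDNAMES}
    row["source"] = source
    row["url"] = url
    return row


def write_rows(rows, handle):
    """Write a header line and all rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Illustrative records only; the source/url pairings are made up.
rows = [
    to_unified_row(
        {"title": "Plot in Area 41", "location": "LILONGWE, Area 41", "price": "85000000.00"},
        source="atsogo",
        url="https://atsogo.mw/listings/properties",
    ),
    to_unified_row({"title": "Mandala House", "location": "Blantyre"}, source="sgw", url="https://sgw.mw"),
]

buffer = io.StringIO()  # stands in for open("malawi_properties.csv", "w", newline="")
write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

Filling missing fields with blanks keeps every row the same width, so sites with only basic support still produce rows that load cleanly alongside the detailed ones.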
Based on the website content, the scrapers will extract data like:
- Plot in Area 41 (LILONGWE, Area 41) - MK 85,000,000.00
- Furnished House (LILONGWE, New Area 43) - MK 2,000.00 (For rent)
- Plot for sale (LILONGWE, Area 46) - MK 115,000,000.00 (2100 sqm)
- Mandala House (Blantyre) - MK 1,500,000 (For rent)
- 308 Hectare Farm (Mangochi) - MK 560,000,000.00 (For sale)
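Listing text like the examples above has to be turned into numeric price and area_sqm values. One possible way to do that with regular expressions is sketched below; the function names and patterns are assumptions for illustration, not the scrapers' own parsing code.

```python
import re
from decimal import Decimal

# "MK" followed by digits with optional thousands separators and decimals.
PRICE_RE = re.compile(r"MK\s*(\d[\d,]*(?:\.\d+)?)")
# A number directly followed by "sqm".
AREA_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*sqm", re.IGNORECASE)


def parse_price_mk(text):
    """Return the price in Kwacha as a Decimal, or None if no price is found."""
    match = PRICE_RE.search(text)
    return Decimal(match.group(1).replace(",", "")) if match else None


def parse_area_sqm(text):
    """Return the area in square meters as a float, or None if absent."""
    match = AREA_RE.search(text)
    return float(match.group(1).replace(",", "")) if match else None


sample = "Plot for sale (LILONGWE, Area 46) - MK 115,000,000.00 (2100 sqm)"
price = parse_price_mk(sample)
area = parse_area_sqm(sample)
```

Decimal is used for the price so that large Kwacha amounts keep their exact value when written back to CSV.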
- Atsogo - Complete property listings with detailed information
- SGW - Homepage property listings
- Nyumba24 - Property listings from homepage
- Knight Frank - Basic property detection
- Reynolds - Basic property detection
- 4321 Property - Basic property detection
- The scripts include delays between page requests to be respectful to the servers
- Error handling is included for network issues and parsing problems
- The scripts use realistic User-Agents to avoid being blocked
- Logging is enabled to track the scraping progress
- Each website may have different data availability and structure
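The notes above (delays, error handling, User-Agent headers, logging) can be combined into a single pagination loop. The following is a minimal sketch under stated assumptions, not the code in atsogo_scraper.py: the function names, the User-Agent string, the two-second delay, and the stop-on-empty-page rule are all hypothetical choices.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("scraper_sketch")

# Assumed header value; any realistic browser User-Agent string would do.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
}


def fetch_html(url, timeout=15):
    """Fetch one page with requests; return None on a network or HTTP error."""
    import requests  # imported lazily so the loop below can run without it

    try:
        response = requests.get(url, headers=HEADERS, timeout=timeout)
        response.raise_for_status()
    except requests.RequestException as exc:
        logger.warning("Request failed for %s: %s", url, exc)
        return None
    return response.text


def scrape_pages(page_url, parse_page, fetch=fetch_html, max_pages=50, delay_seconds=2.0):
    """Walk numbered pages until one fails to load, fails to parse, or is empty.

    page_url maps a page number to a URL; parse_page maps HTML to a list of
    listings. Both are supplied by the caller.
    """
    listings = []
    for page in range(1, max_pages + 1):
        html = fetch(page_url(page))
        if html is None:
            break
        try:
            found = parse_page(html)
        except Exception:
            logger.exception("Could not parse page %d", page)
            break
        if not found:
            logger.info("Page %d has no listings; stopping", page)
            break
        listings.extend(found)
        logger.info("Page %d: %d listings", page, len(found))
        time.sleep(delay_seconds)  # pause between requests
    return listings
```

Passing the fetcher and parser in as arguments keeps the loop independent of any one site's URL scheme or HTML structure, which differ between the supported websites.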
Please ensure you comply with each website's terms of service and robots.txt file when using these scrapers. These tools are for educational purposes only.
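One way to check robots.txt rules programmatically is Python's standard urllib.robotparser module. The sketch below parses a made-up rule set so that it runs offline; the rules, the example.invalid URLs, and the user agent name are placeholders rather than any real site's policy.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration. For a live site, call
# parser.set_url("<site>/robots.txt") followed by parser.read() instead.
robots_lines = [
    "User-agent: *",
    "Disallow: /admin/",
    "Crawl-delay: 5",
]

parser = RobotFileParser()
parser.parse(robots_lines)

USER_AGENT = "property-scraper-sketch"  # hypothetical user agent name

listing_allowed = parser.can_fetch(USER_AGENT, "https://example.invalid/listings")
admin_allowed = parser.can_fetch(USER_AGENT, "https://example.invalid/admin/users")
crawl_delay = parser.crawl_delay(USER_AGENT)  # None when no Crawl-delay is set
```

A scraper could skip any URL for which can_fetch returns False and use the crawl delay, when present, as the minimum pause between requests.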
- requests: For making HTTP requests
- beautifulsoup4: For parsing HTML content
- lxml: XML/HTML parser backend for BeautifulSoup
- pandas, numpy, matplotlib, seaborn: For data analysis and visualization in the notebook
- Add support for more property websites
- Implement more sophisticated data extraction for basic support sites
- Add data validation and cleaning
- Create a web interface for the scrapers
- Add support for property images and additional details