A fast, memory-efficient Python script that processes UK postcode data from Doogal.co.uk and keeps only the essential columns for active postcodes.
- Memory Efficient: Processes large CSV files in chunks to handle datasets with millions of rows
- Fast Processing: Optimized for speed when dealing with large postcode datasets
- Filtered Output: Only keeps active postcodes (where "In Use?" = "Yes")
- Essential Columns: Extracts only the most commonly needed columns:
- Postcode
- Latitude
- Longitude
- District
- Country
- Python 3.7 or higher
- pip (Python package installer)
- Clone this repository:
  git clone <repository-url>
  cd postcodes
- Create a virtual environment:
  python -m venv venv
- Activate the virtual environment:
  - Windows:
    venv\Scripts\Activate.ps1
  - macOS/Linux:
    source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt

- Download the UK postcodes CSV file from Doogal.co.uk
- Place the postcodes.csv file in the project directory
- Run the processing script:
  python process_postcodes.py
The script will:
- Process the CSV file in chunks of 100,000 rows
- Filter for active postcodes only
- Extract the 5 essential columns
- Save the result to active_postcodes.csv
- Display progress updates during processing
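The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of process_postcodes.py: the function name `filter_active` and its return value are assumptions made for the example, while the column names, the "In Use?" filter, and the 100,000-row chunk size come from this README.

```python
import pandas as pd

# The five columns this README says are kept in the output.
KEEP = ["Postcode", "Latitude", "Longitude", "District", "Country"]


def filter_active(src, dest, chunksize=100_000):
    """Stream `src` in chunks and write only active postcodes to `dest`.

    Hypothetical helper for illustration. Returns (rows_read, rows_kept).
    """
    total = kept = 0
    with open(dest, "w", newline="", encoding="utf-8") as out:
        # usecols limits parsing to the columns we need, which saves memory.
        reader = pd.read_csv(src, usecols=KEEP + ["In Use?"], chunksize=chunksize)
        for i, chunk in enumerate(reader):
            # Keep rows flagged as in use, then drop the flag column.
            active = chunk.loc[chunk["In Use?"] == "Yes", KEEP]
            # Write the header only once, with the first chunk.
            active.to_csv(out, header=(i == 0), index=False)
            total += len(chunk)
            kept += len(active)
            print(f"Processed {total:,} rows, kept {kept:,}")
    return total, kept
```

Calling `filter_active("postcodes.csv", "active_postcodes.csv")` would then produce the output file described below. Only one chunk is held in memory at a time, so peak memory depends on the chunk size rather than on the size of the input file.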
The script generates active_postcodes.csv with the following structure:
Postcode,Latitude,Longitude,District,Country
AB1 0AA,57.101474,-2.242851,Aberdeen City,Scotland
AB1 0AB,57.102554,-2.246308,Aberdeen City,Scotland
...
- Processing Speed: Data is read in chunks of ~100,000 rows
- Memory Usage: Optimized for large files (2M+ rows)
- Output: Approximately 66% of postcodes are active (varies by dataset)
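Since the share of active postcodes varies by dataset, it can be useful to check it for your own download. The sketch below is one way to do that with only the standard library; the function name `active_share` is an assumption made for this example and is not part of the project.

```python
import csv


def active_share(lines):
    """Return the fraction of rows whose "In Use?" column equals "Yes".

    `lines` is any iterable of CSV text lines, for example an open file.
    Hypothetical helper for illustration.
    """
    total = active = 0
    for row in csv.DictReader(lines):
        total += 1
        if row["In Use?"] == "Yes":
            active += 1
    # Avoid dividing by zero on a file with a header but no data rows.
    return active / total if total else 0.0
```

To run it against the downloaded file, open postcodes.csv with `open("postcodes.csv", newline="", encoding="utf-8")` and pass the file object to `active_share`. Rows are read one at a time, so this also works on the full dataset without loading it into memory.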
This project processes data from Doogal.co.uk, which provides comprehensive UK postcode information including:
- Full list of UK postcodes (active and inactive)
- Geographic coordinates
- Administrative boundaries
- Population data
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Doogal.co.uk for providing the comprehensive UK postcodes dataset
- The pandas library for efficient CSV processing
- The Python community for excellent data processing tools
If you encounter any issues or have questions, please:
- Check the FAQ for common solutions
- Open an issue on GitHub
- Contact the maintainers
- Initial release
- Basic postcode processing functionality
- Memory-efficient chunk processing
- Active postcode filtering