2015 CrunchBase Data Export as CSV
.gitignore Ignore virtualenv and emacs backups. Jan 23, 2016
LICENSE CC-BY-NonCommercial v4.0 Oct 4, 2015
acquisitions.csv December 2015 Export. Jan 23, 2016
additions.csv December 2015 Export. Jan 23, 2016
companies.csv December 2015 Export. Jan 23, 2016
crunchbase-csv.py
investments.csv December 2015 Export. Jan 23, 2016
readme.md December 2015 Export. Jan 23, 2016
requirements.txt Initial python conversion script. Oct 2, 2015
rounds.csv

Crunchbase Data As CSV

This data was extracted from the December 4, 2015 Crunchbase Data Export.

This repository includes unofficial CSV exports of the individual worksheets in crunchbase_export.xlsx. I previously munged the data by hand in Excel, but have since moved the dirty work to Python. The XLSX file is read with openpyxl, and unicodecsv writes the CSVs.

The Excel workbook is transformed as follows:

  • One CSV file per worksheet
  • Skip the analysis page and empty columns
  • Remove redundant reduced-precision date columns (month, quarter, year)
  • Remove dates missing a year (year 1000 is just wrong)
  • Remove trailing blank rows
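
The cleanup steps above can be sketched roughly as follows. This is not the code in crunchbase-csv.py, only an illustration that works on rows already read from a worksheet (header row first). The function name `clean_rows`, the `_month` / `_quarter` / `_year` header suffixes, and the choice to treat a column with a blank header as empty are all assumptions made for this example.

```python
import datetime

# Assumed suffixes for the reduced-precision date columns; the header names
# in the real export may differ.
REDUNDANT_SUFFIXES = ("_month", "_quarter", "_year")


def clean_rows(rows):
    """Clean one worksheet given as a list of rows, header row first."""
    rows = [list(r) for r in rows]

    # Remove trailing blank rows.
    while rows and all(cell in (None, "") for cell in rows[-1]):
        rows.pop()
    if not rows:
        return []

    header, body = rows[0], rows[1:]

    # Keep columns that have a header (assumption: a blank header marks an
    # empty column) and that are not reduced-precision date columns.
    keep = [
        i for i, name in enumerate(header)
        if name not in (None, "") and not str(name).endswith(REDUNDANT_SUFFIXES)
    ]

    def fix(cell):
        # Treat dates in the year 1000 as missing.
        if isinstance(cell, datetime.date) and cell.year == 1000:
            return None
        return cell

    cleaned = [[header[i] for i in keep]]
    for row in body:
        cleaned.append([fix(row[i]) if i < len(row) else None for i in keep])
    return cleaned
```

Each cleaned worksheet would then be written to its own CSV file, named after the worksheet.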

Usage

virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt
python crunchbase-csv.py crunchbase_export.xlsx
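
Once the CSVs exist, they can be loaded with Python's standard csv module. The sketch below is a minimal example, assuming Python 3 and UTF-8 encoded files; the helper names `read_records` and `load_csv` are made up for illustration and are not part of this repository.

```python
import csv


def read_records(fileobj):
    """Return the rows of an exported CSV as a list of dicts keyed by header."""
    return list(csv.DictReader(fileobj))


def load_csv(path):
    """Open a CSV file (UTF-8 assumed) and return its rows as dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return read_records(f)
```

For example, `load_csv("companies.csv")` would return one dict per company, keyed by whatever column names appear in that file's header row.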

License

Use of this data is governed by the CrunchBase Terms of Service and Licensing Policy.

This data dump is provided for non-commercial use under the Creative Commons Attribution-NonCommercial (CC-BY-NC) license. Any commercial use requires a separate license from CrunchBase.

crunchbase-csv.py is Copyright (c) Peter Tripp and made available under the terms of the MIT License.