A few months ago a friend of mine, Kidus, decided to scrape ezega.com, which is a sort of index for businesses in Ethiopia. The result was a JSONL file with about 8k entries that I've been sitting on ever since, with the vague goal of looking into it but never finding the time.
I finally have the time now, and while doing data analysis with clean datasets taken off the internet has been fun, it's now time to work with something dirty and real.
Note: If you just want to look at it without cloning and playing with it, you can check it out here.
The dataset contains the following columns:
- business_title: The title/name of the business.
- business_image: Image associated with the business.
- business_location: Location of the business.
- business_url: URL associated with the business.
- business_description: Description of the business.
- business_numbers: Numeric data related to the business.
- ezega_url: URL from the Ezega website.
- category: Category of the business.
- sub_category: Sub-category of the business.
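Since the data is scraped, a reasonable first step is to check how often each of these columns is actually filled in. Below is a minimal sketch of that check using only the standard library. The helper names (`read_jsonl`, `missing_counts`) and the two sample records are made up for illustration and are not taken from the real dataset; only the column names come from the list above.

```python
import io
import json
from collections import Counter

# Column names as listed above.
EXPECTED_COLUMNS = [
    "business_title", "business_image", "business_location",
    "business_url", "business_description", "business_numbers",
    "ezega_url", "category", "sub_category",
]


def read_jsonl(handle):
    """Parse one JSON object per non-blank line of a file-like object."""
    records = []
    for line in handle:
        line = line.strip()
        if line:  # skip blank lines
            records.append(json.loads(line))
    return records


def missing_counts(records, columns=EXPECTED_COLUMNS):
    """Count, per column, how many records have no usable value."""
    missing = Counter({col: 0 for col in columns})
    for record in records:
        for col in columns:
            # Treat an absent key, None, an empty string, or an empty list as missing.
            if record.get(col) in (None, "", []):
                missing[col] += 1
    return missing


# Made-up sample standing in for the real file (hypothetical values).
sample = io.StringIO(
    '{"business_title": "Example Cafe", "category": "Food", "business_url": ""}\n'
    "\n"
    '{"business_title": "Example Garage", "category": "Automotive"}\n'
)

records = read_jsonl(sample)
missing = missing_counts(records)

print(len(records), "records")
for col in EXPECTED_COLUMNS:
    print(f"{col}: {missing[col]} missing")
```

To run this against the real scrape, you would pass an open file handle to `read_jsonl` instead of the in-memory sample. What counts as "missing" here is an assumption and may need adjusting once you see how the scraper encoded empty fields.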