29 changes: 27 additions & 2 deletions README.md
@@ -1,2 +1,27 @@
# image-to-csv-parser
This repository contains scripts that convert data tables in images into a machine-readable format.

### Our Goal
We are working closely with the government of Assam to actively identify tenders related to flood preparedness and response. We hope to triangulate this information with other open datasets, such as satellite imagery, geospatial data, and demographic data, so that we can build an intelligent data ecosystem that helps districts in Assam better prepare for floods.

### Available Data
We have identified a source of flood alerts for Assam, organized by year. Please find the reports [here](http://asdma.gov.in/alerts_details.html).
Once you select a year and then a date, you will be able to access that day's report, which is published as an image.

### Recommended Steps
* Go through a few flood reports to understand their pattern, and download the images.
* Next, develop a scalable, robust, and readable script that converts the data tables in the images into a machine-readable format.
* Develop a dynamic pipeline that takes these report images as input, converts each table into a machine-readable format, and stores it in a SQL or NoSQL database, based on your understanding of data modeling.
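
To make the shape of such a pipeline concrete, here is a minimal sketch in Python, not a reference solution. It assumes a lot: the OCR step is passed in as a plain function (`ocr`) that returns the text of an image, columns in that text are separated by tabs or by two or more spaces, and the table has three hypothetical columns (`district`, `villages_affected`, `population_affected`). The actual reports may be laid out differently, and the choice of OCR engine and database is left to you; SQLite is used here only because it ships with Python.

```python
from __future__ import annotations

import re
import sqlite3
from typing import Callable, Iterable

# Hypothetical column layout; the real report tables may differ.
COLUMNS = ("district", "villages_affected", "population_affected")


def parse_table_lines(lines: Iterable[str]) -> list[tuple[str, int, int]]:
    """Turn OCR text lines into typed rows, skipping lines that do not fit."""
    rows = []
    for line in lines:
        # Assumption: cells are separated by a tab or by 2+ spaces.
        cells = re.split(r"\t|\s{2,}", line.strip())
        if len(cells) != len(COLUMNS):
            continue  # not a table row (title, footer, OCR noise)
        district, villages, population = cells
        try:
            rows.append((
                district,
                int(villages.replace(",", "")),
                int(population.replace(",", "")),
            ))
        except ValueError:
            continue  # header row or a misread number
    return rows


def store_rows(conn: sqlite3.Connection, report_date: str,
               rows: list[tuple[str, int, int]]) -> int:
    """Insert rows for one report; re-running the same report overwrites it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """CREATE TABLE IF NOT EXISTS flood_report (
                   report_date TEXT NOT NULL,
                   district TEXT NOT NULL,
                   villages_affected INTEGER,
                   population_affected INTEGER,
                   PRIMARY KEY (report_date, district)
               )"""
        )
        conn.executemany(
            "INSERT OR REPLACE INTO flood_report VALUES (?, ?, ?, ?)",
            [(report_date, *row) for row in rows],
        )
    return len(rows)


def run_pipeline(image_path: str, report_date: str,
                 ocr: Callable[[str], str],
                 conn: sqlite3.Connection) -> int:
    """Image -> OCR text -> parsed rows -> database; returns rows stored."""
    text = ocr(image_path)  # `ocr` is a placeholder for any OCR engine
    return store_rows(conn, report_date, parse_table_lines(text.splitlines()))
```

Keeping OCR, parsing, and storage as separate functions means each stage can be tested on its own, for example by passing a stub `ocr` function that returns known text, and the OCR engine or database can be swapped without touching the rest.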


### What are we looking for
* Process: We want to understand your process: how you identified the challenge and how you arrived at your solution. Documentation of this process would be a valuable addition.
* Skills: Your skill set for converting scanned reports into a machine-readable format and storing the result in a database.
* Communication: Communication is key in a remote working environment such as ours. We want to observe how you communicate, not just through your work but also with us via the various channels we have.

### Timeline
1 week. Do get back to us with any questions or requests for clarification.

### How will CDL use the submission?
One of our key values is openness. All our code is under open licenses, and if CivicDataLab ends up using any of your work in our live solution(s), we are happy to call out the contribution in relevant channels. You are also free to make it part of your GitHub profile.

### Co-creation & Collaboration
At CivicDataLab, we believe in collaboration and co-creation. Feel free to discuss your work with us throughout the given time period either through email or through a scheduled call. We’re more than happy to provide feedback on a continuous basis, and not just at the end of the task. In case you have any questions, don’t hesitate to ping us.