Tasks for Interns
Update: Repo for interns (with full write access) created, click here to check out your virtual office!
Check out these tasks, and alert the group on the #pune slack channel when you're ready to take one up. Some tasks need you to be added to a spreadsheet or similar; contact Nikhil for that.
project | task | who's on it |
---|---|---|
PMPML | Assign UID to BRTS stops and nonBRTS stops. link | |
PMPML | Study and Analysis of GTFS specification (global standard for public transport data) link | |
Development Plan | Maps warping link | |
MSRTC | find lat-long of bus stops/stands link | |
Pune Budget | curating the published excel data and bringing it into a flat, standard tabular form link | |
Pune electoral | Transcribing ward-wise admin, zone data from image shared by PMC and adding into the tables made. link | |
Pune electoral | gathering photos and other info of elected corporators. link |
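
As a rough illustration of the Pune Budget curation task in the table above, here is a minimal sketch of one common flattening step: carrying a section heading down onto the item rows beneath it so that every row stands on its own. The row layout, the heading names and the amounts are all made up for the example; the real sheets may be organised quite differently.

```python
def flatten_rows(rows):
    """Carry section headings down onto the item rows beneath them.

    rows: list of (label, amount) pairs as read from a sheet. A heading row
    is assumed to have amount None; an item row has a number.
    (This layout is a hypothetical example, not the actual budget format.)
    Returns one dict per item row.
    """
    flat = []
    section = None
    for label, amount in rows:
        if amount is None:
            # Heading row: remember it for the items that follow.
            section = label.strip()
            continue
        flat.append({"section": section, "item": label.strip(), "amount": amount})
    return flat


# Made-up sample data, for illustration only.
sample = [
    ("Department A", None),
    ("  Salaries", 120.0),
    ("  Maintenance", 45.5),
    ("Department B", None),
    ("  Equipment", 80.0),
]

for record in flatten_rows(sample):
    print(record)
```

Once every row carries its own section, the data can be sorted, filtered or loaded into a database without losing context.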
project | task | who's on it |
---|---|---|
Misc | Check out Datazar and similar online platforms for loading data and performing R tasks collaboratively online, find limitations of free accounts | |
PMPML | Separate routes data into BRTS-side and nonBRTS-side link | |
Pune Budget | Load into a database and create queries, views. 2017-18 is to be cleaned but 2016-17 data is ready for databasing. link | |
Pune Budget | Check out standard formats shared by folks at Open Budgets India portal and create queries to adapt our data to that. link | |
Pune Budget | year on year comparisons, including comparing budgeted vs actual expenditure for years by combining different years' budget data link | |
MH Villages | finding district-wise, taluka-wise village counts etc from shapefile metadata and census data and comparing them, flagging differences link | |
MSRTC | find and publish statistics of bus stops by taluka, district etc link |
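
For the MH Villages count-comparison task in the table above, the core idea can be sketched in a few lines: count villages per district in each source, then list the districts where the two counts disagree. The field name `district` and the sample records are assumptions for the example; the actual shapefile metadata and census tables will have their own column names.

```python
from collections import Counter


def count_by(records, key):
    """Count how many records share each value of `key`."""
    return Counter(r[key] for r in records)


def compare_counts(shapefile_rows, census_rows, key="district"):
    """Return {name: (shapefile_count, census_count)} where the counts differ.

    A name present in only one source shows up with a count of 0 on the other side.
    """
    a = count_by(shapefile_rows, key)
    b = count_by(census_rows, key)
    diffs = {}
    for name in sorted(set(a) | set(b)):
        if a[name] != b[name]:
            diffs[name] = (a[name], b[name])
    return diffs


# Made-up sample records, for illustration only.
shapefile_rows = [
    {"district": "D1", "village": "V1"},
    {"district": "D1", "village": "V2"},
    {"district": "D2", "village": "V3"},
]
census_rows = [
    {"district": "D1", "village": "V1"},
    {"district": "D1", "village": "V2"},
    {"district": "D2", "village": "V3"},
    {"district": "D2", "village": "V4"},
]

print(compare_counts(shapefile_rows, census_rows))
```

The same function works taluka-wise by passing a different `key`, provided both sources carry that field.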
project | task | who's on it |
---|---|---|
PMPML | Mapping routes, calculating distance between consecutive stops from lat-long and flagging any routes where this distance is too great or it looks buggy on the map. link | |
MH Villages | geo-referencing / map-warping taluka pdfs from MRSAC to web-map and comparing with shapefile to detect anomalies, missing villages etc. link | |
MH Villages | tracking shapes with repeating village codes, comparing with taluka PDFs and deciding if they are to be merged, assigned different codes etc link | |
MH Villages | tracking shapes with blank village codes, comparing with taluka PDFs to figure out if they belong anywhere link | |
MH Villages | Map MLA constituencies to villages, so that in the metadata of every village we can find which constituency it is and then find who is the MLA etc. Publish this census code to constituency lookup table separately. link | |
MH Villages | Track villages migrated to new district formed, Palghar link |
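
For the PMPML route-checking task in the table above, one possible way to compute the distance between consecutive stops is the haversine (great-circle) formula. The sketch below is only an illustration: the stop names, the coordinates and the 2 km threshold are made-up assumptions, and the right threshold would need to be chosen by looking at the real routes.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat-long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flag_long_hops(stops, max_km=2.0):
    """Return (from_stop, to_stop, distance_km) for consecutive stops farther apart than max_km.

    stops: list of (name, lat, lon) in route order.
    max_km: hypothetical threshold, to be tuned against real data.
    """
    flagged = []
    for (n1, la1, lo1), (n2, la2, lo2) in zip(stops, stops[1:]):
        d = haversine_km(la1, lo1, la2, lo2)
        if d > max_km:
            flagged.append((n1, n2, round(d, 2)))
    return flagged


# Made-up route, for illustration only.
route = [
    ("Stop A", 18.500, 73.850),
    ("Stop B", 18.505, 73.855),
    ("Stop C", 18.600, 73.950),  # suspiciously far from Stop B
]

for hop in flag_long_hops(route):
    print(hop)
```

A flagged hop is not necessarily an error; it is a prompt to look at that pair of stops on the map.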
project | task | who's on it |
---|---|---|
PMPML | GTFS feed creation link | Gaurav |
MH Villages | DIY Map Choropleth Plotter link | |
MH Villages + others | Match the following link | |
MH Villages + others | Find shape from lat-long link | |
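
The "Find shape from lat-long" task above is essentially a point-in-polygon lookup. Here is a minimal sketch using the ray-casting test on plain coordinate lists. It assumes each shape is a single simple polygon with no holes, and the shape names and coordinates are invented for the example; real village shapefiles often contain multi-part shapes, for which a geometry library such as shapely is usually the more practical choice.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is the point inside the polygon?

    polygon: list of (lon, lat) vertices; the ring does not need to be closed.
    Points exactly on an edge may be reported either way.
    """
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does the edge (j -> i) straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def find_shape(lon, lat, shapes):
    """Return the name of the first shape containing the point, or None.

    shapes: dict mapping a shape name to its polygon (hypothetical structure).
    """
    for name, polygon in shapes.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Made-up square shapes, for illustration only.
shapes = {
    "Shape 1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Shape 2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}

print(find_shape(1, 1, shapes))
print(find_shape(3, 1, shapes))
print(find_shape(5, 5, shapes))
```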
project | task | link |
---|---|---|
Mapping | Crowdsourcing Map-based data | link |
Misc | Data gathering for various topics where official datasets aren't publicly available and we can work on organically building them up. Example: urban farms, organic/ZBNF farms, tree plantation sites, public schools. (For some, there may be NGOs / volunteer groups already working on them and we could collaborate with them) | |
MSRTC | find lat-long of bus stops/stands. Automating this, using address lookup, census lookup, matching with MH Villages data etc. | link |
MSRTC | Explore ways to gather routes, timetables data | |
While you can take up any of the things here, or come up with tasks yourself, as part of this internship you must spend at least 1 hr/week on the basic data cleaning and curating tasks listed in the first table on this page.
Our starting point is real-world data rather than any academic discipline, and we're agnostic about exactly how the work gets done. For example, we care more about getting the city's bus stops and routes data properly managed than about doing it exclusively through R or MySQL or JSON or excel etc. In our work with this kind of data, we have observed time and again that projects require working in interdisciplinary ways (and you can see that above). So one can say we have an object-oriented instead of procedural approach.
We'll leave it to you to figure out which project/task matches the methodology you want to work on, or you can pick any topic and take it forward in your chosen methodology. But it might also be beneficial to work outside your predefined subject area in the course of this internship and focus instead on a social subject like public transit, water, public finances etc. There are many paths to the mountain-top!
- Create your free account on GitHub if you don't already have one.
- We expect the interns to maintain detailed logs of the steps they do in the tasks they take up, and take screenshots of the most important steps.
- This logging will be done on this repo we've set up for the interns where they will have edit access. We'll leave it to you guys to sort out what goes where. You can start by going to the wiki and making a page for yourself. In case of a team working together, they need only have one document of the task.
- GitHub markdown syntax is damn easy and super-cool and we expect interns to learn and get acquainted with it as they log their progress. There are also tools for converting from word to markdown. For offline editors, check out Remarkable.
- Create a free account on http://imgur.com if you don't already have one. This will be where you upload all your screenshots. On uploading, you'll get a Direct Link URL which you'll embed in your work log.
- Join the datameet slack network. It's linked on the home page. Over there (#pune channel) you can post which task you're taking up, whenever you're ready to get started.
- Stuck somewhere? Post it here: https://github.com/datameet-pune/interns/issues. It doesn't have to be about code, it can be as simple as "can you help me sort this table properly". We're all here to learn and this internship has people from different backgrounds and a variety of skillsets. By using the open forum space, we can make it possible for peers and even strangers on the web to help us.
- Use the datameet slack network, #pune channel to post queries and discuss things. Heck, start an issue and post its link on the slack channel.
- You can also join the WhatsApp group (see home page for joining link) and chat there. But please keep it short over there, and strictly no forwards.
- There are many volunteers in our network who can guide you in specific matters. Reach out.
PS: Are you new here? Please see our home page and the call for internships page to know what this is all about.