- Work as a team
- Work on NYC data
- Explore the provided data files
- Brainstorming
- Come up with a research question (Wednesday)
- Present your research question (Thursday on arrival)
- Cleaning the data
- Determine what data you need to answer this question
- Clean the data (in Python)
- Models / Visualizations / Statistics
- What types of visualizations / models / statistics do you want to provide?
- Think about the high-level algorithms needed to do the data science work
- Go to Python to do it!
- Discuss your model and its limitations
- You may need to iterate!
- You will use sprints and pair programming DURING class!
- Deliver a presentation
- Deliver a Google Colab notebook
- Presentation in front of an audience (Friday)
- Use of storytelling during the presentation
- Mentors will collect presentations and notebooks - all in Google Drive
The presentation outline below is recommended, but you can modify it to some extent. Presentations are 10 minutes long at most.
Presentations are done in Google Docs (no other software).
- Title (generally the research question)
- Team (with picture)
- bit.ly link to the Google Colab notebook
- Research question
- Description of the data
- Data science process
- High-level algorithms / steps
- Python code with comments
- Answer to the research question
- At least two graphs / charts
- Exploration or results graphs / charts
- Discussion on your experience with using real data (science and tech)
- Describe your experience in the Summer Computing Institute
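
As an illustration of the "Python code with comments" item, and of the clean-then-summarize steps in the project outline, here is a minimal sketch. It is not taken from the provided data files: the inline sample, the column names (`borough`, `complaint_type`, `response_hours`), and the helper functions `clean_rows` and `mean_hours_by_borough` are all hypothetical, chosen only to show the shape of the work. It uses the Python standard library so it runs without extra installs; a team could just as well use a data-frame library in Colab.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for one of the provided NYC data files.
# The column names and values are assumptions made for this sketch.
RAW_CSV = """borough,complaint_type,response_hours
Brooklyn,Noise,4.5
brooklyn ,Noise,3.0
Queens,Heating,
QUEENS,Heating,12.0
Bronx,Noise,n/a
Bronx,Heating,8.0
"""


def clean_rows(raw_text):
    """Normalize borough names and keep only rows with a numeric response time."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Fix inconsistent spacing and capitalization ("brooklyn " -> "Brooklyn").
        borough = row["borough"].strip().title()
        try:
            hours = float(row["response_hours"])
        except ValueError:
            # Drop rows whose value is empty or not a number (e.g. "n/a").
            continue
        cleaned.append(
            {
                "borough": borough,
                "complaint_type": row["complaint_type"].strip(),
                "response_hours": hours,
            }
        )
    return cleaned


def mean_hours_by_borough(rows):
    """Group the cleaned rows by borough and average the response time."""
    groups = {}
    for row in rows:
        groups.setdefault(row["borough"], []).append(row["response_hours"])
    return {borough: statistics.mean(values) for borough, values in groups.items()}


rows = clean_rows(RAW_CSV)
summary = mean_hours_by_borough(rows)

for borough, mean_hours in sorted(summary.items()):
    print(f"{borough}: {mean_hours:.2f} hours on average")
```

Dropping incomplete rows is only one possible cleaning choice; it shrinks the sample, which is worth mentioning when discussing the model's limitations. The `summary` dictionary is the kind of result that could feed one of the required charts.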