MichiganDataScienceTeam/MDST-Onboarding

MDST Tutorials - W24

Setup

  1. If you haven't already, fill out this form and join our mailing list. This will keep you up-to-date on the club.

  2. Download the files in this repo by clicking Code (the green button near the top) -> Download ZIP and unzip the files into a folder. You can of course also fork the repo if you have experience with Git.

  3. Follow the general setup guide.

  4. Complete the Git setup guide.

For most people, this is the hardest part of the tutorial! If you feel frustrated, know it is normal. Come see us at tutorials or office hours and we will help you out.

What do I do if I cannot get the setup working in time?

If you have trouble with the general setup, you can follow the Google Colab setup guide and use Colab to complete the tutorials.

You can also use Deepnote or Hex. For the latter, you must not sign up with your umich.edu email address.

If you have trouble with the Git setup, you can upload your files to GitHub by going to your GitHub repository and clicking Add file -> Upload files.

Tutorials & Checkpoints

Get started with tutorial0 and checkpoint0 in the tutorial0 folder and then move on to tutorial1 and checkpoint1 in the tutorial1 folder. We recommend working through each tutorial before attempting the corresponding checkpoint.

The two challenges in the Optional Challenges folder are completely optional. You will find instructions about them in the submission section.

The Data-Visualization folder contains materials for those who want to get a head start. pandas.ipynb is a very brief introduction to the built-in Pandas data visualization tools. The AnatomyofMatplotlib folder contains a comprehensive tutorial for the Matplotlib library, which most beginner projects use and which is foundational to other data visualization packages such as seaborn.
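As a rough illustration of how these two pieces fit together, the sketch below plots a small table with the Pandas built-in plotting method and then adjusts the result with ordinary Matplotlib calls. The data and column names are made up for this example and do not come from the tutorial notebooks.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical toy data, made up for illustration only.
df = pd.DataFrame({"week": [1, 2, 3, 4], "attendees": [30, 42, 38, 51]})

# DataFrame.plot is the Pandas built-in plotting entry point.
# It draws with Matplotlib and returns a Matplotlib Axes object.
ax = df.plot(x="week", y="attendees", kind="line", marker="o")

# Because `ax` is a regular Matplotlib Axes, Matplotlib methods work on it.
ax.set_title("Attendance by week (toy data)")
ax.set_ylabel("attendees")
```

In a Jupyter or Colab notebook you can drop the `matplotlib.use("Agg")` line and the figure should display inline.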

How we are supporting you

These checkpoints are not meant to be selective. Their sole purpose is to give you sufficient foundational knowledge about Python and some important packages so you can start contributing to a project.

The definition of success for us is to have everyone who begins the tutorials finish them. Thus, we will offer support in two ways:

  • Sunday Tutorials: Live tutorials will be held from 12 to 3 on 1/21 and 1/28, in person only, in one of the Fishbowl classrooms. These are the stand-alone rooms inside the Fishbowl in Mason Hall. Tutorials will be a combination of short presentations and Q&A.

  • Weekday Office Hours: We will hold office hours from 7 to 9 PM on 1/16, 1/23, and 1/30, in person on the third floor of the UGLI.

Neither tutorials nor office hours are mandatory.

We have also created a forum where you can ask questions.

Join the mailing list and monitor the join page for updates.

Submission

The tutorial submission form will be released soon.

In your submission, make sure to select the option saying you are a new member, and submit the link to the repository containing all your tutorial checkpoints. We are looking for:

  • [REQUIRED] checkpoint 0 and checkpoint 1. These are assessed by completion and effort, not accuracy.
  • [OPTIONAL] ML Challenge and Stats Challenge. These are assessed by merit. We usually put new members on beginner projects for their first semester, but you may want to work on advanced projects right away if you are experienced with data science. These two challenges let you demonstrate that experience. You can choose to complete one or both of them.

We strongly recommend completing at least one challenge if the project you are most interested in is labelled as an advanced project. This will give you the best chance of being placed on that team.

Contact

All technical or logistical questions MUST be posted on Piazza. We will not answer those questions over email.

If you have a personal question, email us at mdst-education@umich.edu.

Official Documentation

The following Python libraries are used extensively throughout the checkpoints, challenges, MDST projects, and beyond.

NumPy: https://numpy.org/doc/stable/

Pandas: https://pandas.pydata.org/docs/

Matplotlib: https://matplotlib.org/stable/gallery/index

scikit-learn: https://scikit-learn.org/stable/user_guide.html
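If you want a quick check that these four libraries are available in your environment, a minimal sketch like the following may help. The helper name `installed_versions` is made up for this example, and the package names are the usual PyPI distribution names, which is an assumption about how you installed them.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for the four libraries listed above.
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

A package reported as not installed usually means the setup step for it was skipped or ran in a different environment.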