svaksha edited this page Dec 15, 2012 · 31 revisions

PyData Workshop-Sprint 2012 at NYC

Are you interested in a one-day hands-on intensive Pandas workshop and sprint for new contributors with a Pandas core-dev leading the sprint?

Read on...

OBJECTIVE

The aim of this workshop and sprint is to bring in more bug triagers and new contributors to scientific programming in Python by teaching attendees about data-processing tools in Python and having them contribute a patch to Pandas.

Chang She, sprint leader

Chang She, a core developer of Pandas, the free/libre open-source library for data analysis, will lead the sprint and conduct the workshop. He is also the co-founder of LambdaFoundry, a company that provides high-productivity solutions for data science.

Workshop-Sprint format and Timeline

It will be a small workshop so that every attendee can ask questions freely and contribute easily to the Pandas project. (Please note that this workshop and sprint is for "you" - an opportunity to bootstrap yourself into a Pandas contributor.)

  • Background
Participants don't need prior experience with Pandas, but a basic understanding of Python and some kind of version control is required. Knowledge of NumPy is a bonus but not a requirement.
  • Groups
Registration is capped at 30 participants. Participants will be split by interest into groups of 3-4 people, or smaller.

Morning Session: 0940-1300 hrs EST

Here is the tentative schedule for the day, which is split into two parts:

  • 0940-0945 EST : Arrive at venue, Registration check.
  • 0945-1010 EST : Laptop setup (or meet and greet).
  • 1010-1015 EST : Brief introduction of the event, by the organizers.
  • 1015-1230 EST : Chang will cover the following topics:
 1. Introduction to PyData in general
 2. How to be more productive - IPython, editor choice, etc.
 3. Introduction to Pandas, its basics:
    - object creation
    - indexing
    - computations
    - missing data
    - groupby
    - data input/output
  • 1230-1300 EST : Guided group work, with Chang showing us how to contribute to Pandas:
    - Writing documentation and code, writing test cases, running the test suite, and submitting a pull request (the whole git workflow)

Here, people can break up into small groups of 2-4 people, with each group picking one or two Pandas GitHub issues and trying to make a pull request by the end. Within each guided group, some participants should write the unit tests while others work on the actual code.
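For attendees who want a preview, the Pandas basics listed in the morning schedule can be sketched roughly as follows. This is only an illustrative sketch and not the workshop material itself: the toy data, column names, and variable names are assumptions made up for this example.

```python
import io

import numpy as np
import pandas as pd

# Object creation: a small DataFrame with one missing value (NaN).
# The data below is invented purely for illustration.
df = pd.DataFrame(
    {
        "city": ["NYC", "NYC", "Boston", "Boston"],
        "year": [2011, 2012, 2011, 2012],
        "sales": [10.0, np.nan, 7.0, 9.0],
    }
)

# Indexing: a boolean mask selects the rows for one city.
nyc = df[df["city"] == "NYC"]

# Computations: the column mean skips NaN values by default.
mean_sales = df["sales"].mean()

# Missing data: replace NaN in the "sales" column with 0.0.
filled = df.fillna({"sales": 0.0})

# groupby: total sales per city.
totals = filled.groupby("city")["sales"].sum()

# Data input/output: round-trip the frame through CSV held in memory.
buffer = io.StringIO()
filled.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
```

Each step maps to one item in the topic list, so the snippet can be pasted into IPython and explored one line at a time.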

Lunch

Afternoon Session: 1345-1715 hrs EST

  • 1345-1500 EST : Guided group work continues. If we have time, Chang may cover the following topics:
    - basic plotting
    - timeseries API
    - data reshaping
  • 1500-1715 EST : SPRINT time, with Chang She available as a mentor. The afternoon sprint will be "free form": attendees can work on any Pandas tickets they like. Beginners who'd like to contribute are more than welcome, even if it's just writing documentation. For documentation, some participants can work on the phrasing while others work on examples.
  • 1715 EST : Wrap-up, and discussion of next steps.
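Two of the optional afternoon topics, the timeseries API and data reshaping, can be previewed with a short sketch like the one below. It is not taken from the workshop material: the values and names are assumptions chosen for illustration, and plotting is left out so that the example needs nothing beyond Pandas.

```python
import pandas as pd

# Time series: four daily observations (values invented for illustration).
dates = pd.date_range("2012-12-16", periods=4, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)

# Downsample to 2-day bins and sum the values in each bin.
two_day = ts.resample("2D").sum()

# Reshaping: start from a "long" table with one row per (date, variable).
long_form = pd.DataFrame(
    {
        "date": list(dates[:2]) * 2,
        "variable": ["a", "a", "b", "b"],
        "value": [1, 2, 3, 4],
    }
)

# Pivot to a "wide" table: one row per date, one column per variable.
wide = long_form.pivot(index="date", columns="variable", values="value")
```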

List of Pandas Bugs

Please make a list of the Pandas bugs/issues on GitHub that you'd like to work on, or that you just find interesting.

Date & Time

Sunday, 16-December-2012. Kickstart your Sunday with a Pandas workshop and end it with a sprint in the afternoon.

Venue

Pivotal Labs has generously offered their office as the venue for this workshop-sprint. (Thank you, Pivotal!)

  • Address: Pivotal Labs (8th floor), 841 Broadway, New York, NY.
Google-map link

REGISTRATION

The event is free but open only to registered attendees. Space is limited, so if you are interested in becoming a Pandas contributor, register for the PyData Workshop+Sprint 2012 before Dec 12th. Closer to the event, we will contact all registered attendees to reconfirm their attendance. Please bring a copy of your registration or the Attendee/Order Number when you arrive at the venue.

O'Reilly Media

O'Reilly Media is generously providing all our registered attendees with a free copy of the "Python for Data Analysis" e-book, written by Wes McKinney. (Thank you, O'Reilly!)

Code of Conduct

The PSF board passed a resolution last month requiring a Code of Conduct for all events it funds. Since this event is funded by grant money (for lunch and volunteer travel expenses), it falls under the Code of Conduct, and all registered attendees are expected to follow it.

You can read more about them in the following links:

VOLUNTEER

Besides this public wiki, we have a mailing list for volunteers that you can subscribe to.

Comments

The wiki itself is actually a git repository, which means you can clone it, edit it locally/offline, add images or any other file type, and push it back to us. It will be live immediately. Go ahead and try:

    $ git clone https://github.com/svaksha/PyData-Workshop-Sprint/wiki

Wiki pages are normal files, with the .wiki extension. You can edit them locally and create new ones. Have fun!