In [42]:
%run ../talktools.py

# Meeting/Lecture 1 - Jan 27, 2020

## Agenda for today

* Introduction / Icebreaker (this notebook, *00_Introduction*)

   * Organize into groups
   
* Class Logistics 

* Lab \#0 progress discussion

   * Report out by group

* A little break 

* Group visualization discussion (*01_plotting_and_viz_intro*)

    * Open question
  
* Gaia presentation (if time)

## Introduction / Icebreaker

* Who are you, who are the teachers?

* Let's get into groups...randomly, but deterministically...

In [13]:
students = ["Ethan", "Suchi", "Julie", "Michael", "Pancham",
            "Elias", "TJ", "Eric", "Ashley", "Mingwei", "Zack",
            "Tiffany", "Maude", "Nick", "James", "Ahmed", "Vaibhav",
            "Mark", "Abeer", "Nicholas", "Alicia", "Antonio", 
            "Erandi", "Greg", "Jessie", "Andrew C", "Evelyn", "Mel",
            "Sergiy", "Shishir", "Dhruv", "Jamie", "Danny", 
            "Yukei", "Andrew"]

In [16]:
print(f"Today we have {len(students)} students.")
print(f"There are no name conflicts: {len(students) == len(set(students))}")

Today we have 35 students.
There are no name conflicts: True


If you don't know about f-strings, today is your lucky day. They were introduced in Python 3.6 and they rock!

Some links for you:

- https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep498
- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
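For anyone new to f-strings, here is a minimal sketch of the features used in this notebook plus a few common format specs. The variable names and values are made up purely for illustration.

```python
# Illustrative values only; none of these come from the class roster logic.
student = "Julie"
n_sources = 1234567
fraction = 0.12345

# Any Python expression can go inside the braces.
line_expr = f"{student} has {len(student)} letters"

# A format spec after the colon controls how the value is rendered.
line_thousands = f"{n_sources:,} sources"        # thousands separators
line_percent = f"{fraction:.1%} of the sample"   # percentage, one decimal
line_padded = f"[{student:>8}]"                  # right-aligned in 8 characters

for line in (line_expr, line_thousands, line_percent, line_padded):
    print(line)
```

The part after the colon follows the same format-spec mini-language as `str.format`, so anything you already know from there carries over.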

In [19]:
preferred_size_of_group = 5
smallest_group_size = 3

In [20]:
# A remainder of 0 means every group is full, so the smallest group has the preferred size.
smallest_group = len(students) % preferred_size_of_group or preferred_size_of_group
print(f"With a preferred group size of {preferred_size_of_group} the smallest group would be {smallest_group}")
print(f"That is {'fine.' if smallest_group >= smallest_group_size else 'Unacceptable!'}")

With a preferred group size of 5 the smallest group would be 5
That is fine.


In [22]:
import numpy as np
import pandas as pd 
import scipy as sp

n_groups = int(np.ceil(len(students) / preferred_size_of_group))
print(f"Number of groups will be {n_groups}.")

Number of groups will be 7.


When using randomness, it's almost always a good idea to be able to reproduce the
results. Pseudo-random number generators are "seeded" with a number and then yield 
seemingly random numbers after that. But those numbers are deterministic given the seed
and the calling sequence of the generator.

See https://www.numpy.org/devdocs/reference/generated/numpy.random.RandomState.html
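A minimal sketch of what "deterministic given the seed and the calling sequence" means in practice. It assumes only `np.random.RandomState` and its `randint` method; the variable names are illustrative.

```python
import numpy as np

seed = 7  # any integer works; the point is that it is fixed

# Two generators built from the same seed...
rnd_a = np.random.RandomState(seed)
rnd_b = np.random.RandomState(seed)

# ...produce identical draws when called in the same order.
draws_a = rnd_a.randint(10, 100, size=5)
draws_b = rnd_b.randint(10, 100, size=5)
same_seed_matches = bool((draws_a == draws_b).all())

# The calling sequence matters too: rnd_a has now been used once, so its
# next draws continue the stream rather than restarting it. A fresh
# generator with the same seed starts again from the beginning.
rnd_c = np.random.RandomState(seed)
second_draws_a = rnd_a.randint(10, 100, size=5)
first_draws_c = rnd_c.randint(10, 100, size=5)
fresh_generator_restarts = bool((draws_a == first_draws_c).all())

print(f"Same seed, same calls, same numbers: {same_seed_matches}")
print(f"Fresh generator repeats the first draws: {fresh_generator_restarts}")
print(f"Continuing rnd_a gives: {second_draws_a}")
```

Newer NumPy code often uses `np.random.default_rng(seed)` instead. The idea is the same, but its number stream is not the same as the one `RandomState` produces for that seed.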

In [28]:
np.random.randint(10, 100)

72

In [38]:
class_seed = 7 # let's set this together
rnd = np.random.RandomState(class_seed)

In [39]:
import copy

shuffled_students = copy.copy(students)
rnd.shuffle(shuffled_students)
shuffled_students

['Julie',
 'Abeer',
 'Suchi',
 'Mingwei',
 'Nick',
 'Antonio',
 'Ahmed',
 'Mel',
 'Elias',
 'Danny',
 'Jamie',
 'Erandi',
 'Mark',
 'Tiffany',
 'Ethan',
 'Maude',
 'Alicia',
 'Vaibhav',
 'Andrew',
 'TJ',
 'Shishir',
 'Jessie',
 'Zack',
 'Evelyn',
 'Ashley',
 'Dhruv',
 'James',
 'Yukei',
 'Sergiy',
 'Eric',
 'Greg',
 'Nicholas',
 'Michael',
 'Andrew C',
 'Pancham']

Now make the groups...

In [40]:
for group_num, members in enumerate(np.array_split(shuffled_students, n_groups)):
    print(f"Group {group_num}: {', '.join(list(members))}")

Group 0: Julie, Abeer, Suchi, Mingwei, Nick
Group 1: Antonio, Ahmed, Mel, Elias, Danny
Group 2: Jamie, Erandi, Mark, Tiffany, Ethan
Group 3: Maude, Alicia, Vaibhav, Andrew, TJ
Group 4: Shishir, Jessie, Zack, Evelyn, Ashley
Group 5: Dhruv, James, Yukei, Sergiy, Eric
Group 6: Greg, Nicholas, Michael, Andrew C, Pancham
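`np.array_split` does not need the roster to divide evenly: when it does not, the first few groups get one extra member. Here is a pure-Python sketch of that sizing rule. `split_sizes` and `split_into_groups` are hypothetical helper names, not NumPy functions, and the 37-person roster is made up to show the uneven case.

```python
def split_sizes(n_items, n_groups):
    """Sizes for a near-equal split: the first (n_items % n_groups)
    groups get one extra member, the rest get n_items // n_groups."""
    base, extra = divmod(n_items, n_groups)
    return [base + 1 if i < extra else base for i in range(n_groups)]


def split_into_groups(items, n_groups):
    """Cut a list into consecutive chunks with the sizes from split_sizes."""
    groups, start = [], 0
    for size in split_sizes(len(items), n_groups):
        groups.append(items[start:start + size])
        start += size
    return groups


# Today's class: 35 students in 7 groups divides evenly.
sizes_35 = split_sizes(35, 7)

# Hypothetical roster of 37: 8 groups, sizes differ by at most one.
roster = [f"student_{i}" for i in range(37)]
groups_37 = split_into_groups(roster, 8)
sizes_37 = [len(g) for g in groups_37]

print(sizes_35)
print(sizes_37)
```

Because the sizes never differ by more than one, no group ends up as a tiny leftover, which is why splitting by number of groups is safer than slicing the roster into fixed-size chunks.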


## Now, a big question for your group to ponder...

*Are there more grains of sand on the Earth's beaches or stars in the observable universe?*

Let's discuss...

## Class Logistics

### Course Aims

- Introduce and motivate a range of analysis techniques and data pipelining
- Gain practical, in-depth experience doing inference on real, open-ended modern astronomical challenges
- Build reproducible, well-tested, well-documented software & infrastructure
- Learn to work with open data and code, and in an open science environment
- Hone presentation (speaking & visualization) skills
- Develop skills for a future in academia, industry, …

### Communication 

* Main communication channel for us is Piazza (https://piazza.com/berkeley/spring2020/ay128256/home). Please sign up now if you haven't already.

* Course website will contain relevant links (https://ucb-datalab.github.io/), mostly pointing to the online material in GitHub (https://github.com/ucb-datalab/course-materials_2020)


### Course Format

- 4 credits (grad and undergrad)
- 1 weekly 3-hour meeting
- “Show & tell” progress reports + instructor lecture
- 4 labs
- Will require a fair amount of dedicated coding time
- It WILL fulfill the astro major lab requirement in Spring 2020
- Undergrads and grads meet together; grads will do more in-depth labs


### Lectures

Mondays 4-7 pm in 131 Campbell Hall (no class on 1/20, 2/17, 3/23)

### Office Hours

* Josh: Wed 4-5 pm (203 Campbell)
* Dan: Tues 12-1 pm (311 Campbell)
* Kareem: Thurs 2-3 pm (419 Campbell)
* By appointment


### Grading
* 10%: Class Participation -- Active engagement in class discussion and lecture, participation during "show and tell"
* 90%: Lab Reports/Notebooks -- due before the specified class, -10% for each day late. You can collaborate with people in the class, but all work (writeups, notebooks, coding, plots, etc.) MUST be your own


## Lab \#0 progress discussion

In [41]:
import sys
print(f"Python: {sys.version}")
print(f"Numpy: {np.__version__}")

Python: 3.6.9 |Anaconda custom (64-bit)| (default, Jul 30 2019, 13:42:17) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
Numpy: 1.17.2
