# Group Selection: Advanced Group Project

### The following code randomly pairs students and assigns each pair to a lecture week

---

## Approach:
##### Section I: Pairing Students
- Ingest (i.e., read in) data from a file containing all students in IST652 M002
- Convert this data into a list of student names (first initial followed by family name)
- Initialize an empty list which will contain pairs of students
- Loop through the list (with a `while` loop) and pair students into groups of two.
    - The pair of students will be collected as a Tuple
    - Append the student-tuple to the list of student pairs
    - Once a student is grouped, remove them from the working copy of the list (i.e., choosing from a list without replacement)
- If there remains a lone student, then add them to an existing group, making them a group of 3.
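
The steps above can be sketched compactly as follows. This is a minimal sketch under stated assumptions, not the notebook's own implementation: it swaps the `while` loop for `random.shuffle`, and the helper name `pair_students` and the five-name sample are hypothetical. It also covers the lone-student case from the last bullet.

```python
import random


def pair_students(names):
    """Randomly pair names; a leftover student joins the last pair (hypothetical helper)."""
    pool = list(names)        # copy so the caller's list is left untouched
    random.shuffle(pool)      # a random order is equivalent to drawing without replacement
    # Slice the shuffled list two at a time; the final odd element (if any) is skipped here
    groups = [tuple(pool[i:i + 2]) for i in range(0, len(pool) - 1, 2)]
    if len(pool) % 2 == 1 and groups:
        # Odd head count: fold the lone student into an existing pair, making a group of 3
        groups[-1] = groups[-1] + (pool[-1],)
    return groups


sample = ['A. Bailey', 'S. Bajpai', 'Q. Chen', 'B. Choi', 'N. Dias']
groups = pair_students(sample)
print(groups)
```

With five names the result is one pair and one group of three; which students land together changes on every run.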

##### Section II: Assigning Group Numbers
- Create a list of Groups 1 through 14 (List Comprehension)
- Loop through each of the Groups and randomly assign it to a pair of students
---

In [1]:
# Python ships with (i.e., comes bundled with) a module for random numbers and random selection.
# We will be using the function called choice from this module
import random

# Importing a CSV handling package
import csv

# pprint stands for pretty print and improves the appearance of printed values
# Notice that I am importing only a single function from pprint, instead of importing the entire package
from pprint import pprint

# Notice that I am aliasing pandas as pd - which is common practice and saves keystrokes
import pandas as pd

## Section I

In [2]:
# Tell Python where to look for the CSV with the student data
# No directory path is given because the CSV sits in the same folder as this notebook, so Python looks in the current working directory
file = 'IST652_students.csv'

Open the CSV using the CSV package:
- https://docs.python.org/2/library/csv.html#csv.reader

In [3]:
# 1. Initialize the list where the data will go
# 2. Open the file
# 3. Append each row of the CSV reader to the student_data list, row by row, creating a list of lists

student_data = []

with open(file, 'r', newline='') as csvfile:  # newline='' is what the csv docs recommend
    reader = csv.reader(csvfile)
    for row in reader:
        student_data.append(row)

# pprint instead of print
pprint(student_data)

[['Family_Name', 'First_Initial'],
 ['Bailey', 'A'],
 ['Bajpai', 'S'],
 ['Chen', 'Q'],
 ['Choi', 'B'],
 ['Dias', 'N'],
 ['Dong', 'Y'],
 ['Dunn', 'J'],
 ['Hegde', 'V'],
 ['Hwang', 'J'],
 ['Lewis', 'A'],
 ['Lo', 'J'],
 ['Nguyen', 'A'],
 ['Park', 'W'],
 ['Penaloza', 'L'],
 ['Qian', 'Y'],
 ['Rayi', 'P'],
 ['Scipione', 'V'],
 ['Sheth', 'M'],
 ['Shrivastava', 'A'],
 ['So', 'H'],
 ['Trevino', 'D'],
 ['Wang', 'M'],
 ['Wang', 'Q'],
 ['Wang', 'Y'],
 ['Wu', 'W'],
 ['Yang', 'C'],
 ['Yao', 'Y'],
 ['Zeugner', 'J']]
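
As an alternative to indexing each row by position, `csv.DictReader` uses the header row as keys, so the header does not have to be skipped by slicing later. The sketch below is illustrative only: it reads from an in-memory string (`csv_text`, a hypothetical stand-in holding two of the rows shown above) so that it runs without the real file.

```python
import csv
import io

# Stand-in for IST652_students.csv: same header, only two of the rows shown above
csv_text = "Family_Name,First_Initial\nBailey,A\nBajpai,S\n"

with io.StringIO(csv_text) as csvfile:
    reader = csv.DictReader(csvfile)      # the first row becomes the dictionary keys
    records = [row for row in reader]     # one dictionary per student

# Columns are looked up by name instead of by index
names = [f"{r['First_Initial']}. {r['Family_Name']}" for r in records]
print(names)  # ['A. Bailey', 'S. Bajpai']
```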


In [14]:
# Clean up the data formatting of each student name
# Append the first initial, followed by '. ' (period + space) to the beginning of the family name

# Start at index 1 because index 0 holds the header row
students = [student[1] + '. ' + student[0] for student in student_data[1:]]
pprint(students)

"""
Equivalently, I could have written this:

students = []

for student in student_data[1:]:
    student_name = student[1] + '. '+ student[0]
    students.append(student_name)
    
"""

['A. Bailey',
 'S. Bajpai',
 'Q. Chen',
 'B. Choi',
 'N. Dias',
 'Y. Dong',
 'J. Dunn',
 'V. Hegde',
 'J. Hwang',
 'A. Lewis',
 'J. Lo',
 'A. Nguyen',
 'W. Park',
 'L. Penaloza',
 'Y. Qian',
 'P. Rayi',
 'V. Scipione',
 'M. Sheth',
 'A. Shrivastava',
 'H. So',
 'D. Trevino',
 'M. Wang',
 'Q. Wang',
 'Y. Wang',
 'W. Wu',
 'C. Yang',
 'Y. Yao',
 'J. Zeugner']


### Time to Pair the Students

In [5]:
# Let's test the random.choice function
pprint(random.choice(students))

'M. Sheth'
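
Because `random.choice` draws from a pseudo-random generator, the name printed above will differ from run to run. If a repeatable result is wanted (for example, to audit the pairing later), the generator can be seeded first. A minimal sketch, assuming an arbitrary seed value of 652 and a four-name subset of the roster:

```python
import random

students = ['A. Bailey', 'S. Bajpai', 'Q. Chen', 'B. Choi']  # subset of the roster

random.seed(652)                  # arbitrary seed, chosen only for illustration
first = random.choice(students)

random.seed(652)                  # re-seeding restarts the same pseudo-random sequence
second = random.choice(students)

print(first == second)  # True
```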


In [6]:
# Let's make a copy of my student list so that I can safely delete values but still retain my original list
students_unpaired = students.copy()

# Initialize a list which will contain the pairs of students (as tuples)

students_paired = []

while len(students_unpaired)>1:
    student1, student2 = '', ''
    
    while student1 == student2:
        student1 = random.choice(students_unpaired)
        student2 = random.choice(students_unpaired)

        
    pair = (student1, student2)
    students_paired.append(pair)
    indx1 = students_unpaired.index(student1)
    del students_unpaired[indx1]
    indx2 = students_unpaired.index(student2)
    del students_unpaired[indx2]

pprint(students_paired)

[('Q. Wang', 'J. Hwang'),
 ('N. Dias', 'L. Penaloza'),
 ('V. Scipione', 'A. Bailey'),
 ('S. Bajpai', 'A. Nguyen'),
 ('Y. Dong', 'V. Hegde'),
 ('C. Yang', 'A. Shrivastava'),
 ('Q. Chen', 'D. Trevino'),
 ('J. Lo', 'B. Choi'),
 ('Y. Wang', 'M. Sheth'),
 ('M. Wang', 'J. Zeugner'),
 ('P. Rayi', 'H. So'),
 ('Y. Qian', 'A. Lewis'),
 ('J. Dunn', 'W. Park'),
 ('Y. Yao', 'W. Wu')]
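
The inner `while student1 == student2` loop above retries until two different students are drawn. One way to avoid the retry, shown here as a sketch rather than as a replacement for the cell above, is `random.sample`, which draws two distinct positions in a single call. The six-name list is a shortened stand-in for the full roster.

```python
import random

students_unpaired = ['A. Bailey', 'S. Bajpai', 'Q. Chen', 'B. Choi', 'N. Dias', 'Y. Dong']
students_paired = []

while len(students_unpaired) > 1:
    # sample picks two distinct positions, so no retry loop is needed
    student1, student2 = random.sample(students_unpaired, 2)
    students_paired.append((student1, student2))
    # remove deletes the first matching value, replacing the index/del steps
    students_unpaired.remove(student1)
    students_unpaired.remove(student2)

print(students_paired)
```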


In [7]:
# Hooray - We have our student pairs!

## Section II

In [8]:
# Create a list of Lecture Weeks from 1 to 14 using a list comprehension:

https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

In [25]:
group_number = [x for x in range(1, 15)]

# range is a class whose objects produce a sequence of integers
    # from start, stop, and step parameters (no list is built until we ask for one)

# We used 15 as the stop value because range excludes it: generation ends at 14

pprint(group_number)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
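
A few quick checks illustrate how `range` treats its bounds. The variable name `weeks` is used here only for illustration.

```python
weeks = range(1, 15)            # start is inclusive, stop is exclusive

print(len(weeks))               # 14
print(weeks[0], weeks[-1])      # 1 14
print(list(range(2, 15, 2)))    # [2, 4, 6, 8, 10, 12, 14]

# list(range(...)) gives the same result as the comprehension used above
print(list(weeks) == [x for x in range(1, 15)])  # True
```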


In [35]:
# Let's make a copy of my student-pairs list so that I can safely delete values but still retain my original list
pairs_unassigned = students_paired.copy()

# Initialize a dictionary where the key will be the student-pair tuple and the value will be the
# number corresponding to the week they present in the syllabus
paired_groups = {}

for week in group_number:
    pair = random.choice(pairs_unassigned)
    paired_groups[pair] = week
    
    pair_indx = pairs_unassigned.index(pair)
    del pairs_unassigned[pair_indx]
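
The same random assignment can also be written without deleting from a list: shuffle a copy of the pairs, then `zip` it with the weeks. This is a sketch of an alternative technique, not the cell above; the three pairs are a hypothetical stand-in for `students_paired`.

```python
import random

# Hypothetical stand-ins for students_paired and the list of weeks
pairs = [('J. Lo', 'B. Choi'), ('Q. Wang', 'J. Hwang'), ('N. Dias', 'L. Penaloza')]
weeks = list(range(1, len(pairs) + 1))

# sample with k=len(pairs) returns a shuffled copy; pairs itself is unchanged
shuffled_pairs = random.sample(pairs, k=len(pairs))

# zip lines up each shuffled pair with a week; dict turns that into pair -> week
paired_groups = dict(zip(shuffled_pairs, weeks))

print(paired_groups)
```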

In [36]:
paired_groups

{('C. Yang', 'A. Shrivastava'): 4,
 ('J. Dunn', 'W. Park'): 14,
 ('J. Lo', 'B. Choi'): 1,
 ('M. Wang', 'J. Zeugner'): 9,
 ('N. Dias', 'L. Penaloza'): 3,
 ('P. Rayi', 'H. So'): 10,
 ('Q. Chen', 'D. Trevino'): 11,
 ('Q. Wang', 'J. Hwang'): 2,
 ('S. Bajpai', 'A. Nguyen'): 6,
 ('V. Scipione', 'A. Bailey'): 12,
 ('Y. Dong', 'V. Hegde'): 5,
 ('Y. Qian', 'A. Lewis'): 13,
 ('Y. Wang', 'M. Sheth'): 8,
 ('Y. Yao', 'W. Wu'): 7}

In [37]:
# Beautify and sort the pairs by their assigned week
pprint(sorted([week, pair] for pair, week in paired_groups.items()))

[[1, ('J. Lo', 'B. Choi')],
 [2, ('Q. Wang', 'J. Hwang')],
 [3, ('N. Dias', 'L. Penaloza')],
 [4, ('C. Yang', 'A. Shrivastava')],
 [5, ('Y. Dong', 'V. Hegde')],
 [6, ('S. Bajpai', 'A. Nguyen')],
 [7, ('Y. Yao', 'W. Wu')],
 [8, ('Y. Wang', 'M. Sheth')],
 [9, ('M. Wang', 'J. Zeugner')],
 [10, ('P. Rayi', 'H. So')],
 [11, ('Q. Chen', 'D. Trevino')],
 [12, ('V. Scipione', 'A. Bailey')],
 [13, ('Y. Qian', 'A. Lewis')],
 [14, ('J. Dunn', 'W. Park')]]
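
If a more readable schedule is wanted, the dictionary items can be sorted by week with a `key` function and printed one line per pair. A minimal sketch using three of the assignments shown above; the name `schedule` is hypothetical.

```python
paired_groups = {
    ('C. Yang', 'A. Shrivastava'): 4,
    ('J. Lo', 'B. Choi'): 1,
    ('Q. Wang', 'J. Hwang'): 2,
}

# Sort the (pair, week) items by the week, which is the second element of each item
schedule = sorted(paired_groups.items(), key=lambda item: item[1])

for pair, week in schedule:
    print(f"Week {week:>2}: {' & '.join(pair)}")
```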


---

<div style="text-align: right"> 
## QED            