<a href="https://colab.research.google.com/github/nneibaue/alumni_shuffler/blob/master/Alumni_Shuffler.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>Alumni Shuffler</h1>


Hello all! Welcome to Google Colab, this awesome tool for sharing Python notebooks in Google Drive. This has been a fun project to work on, and we hope that it is useful! We live in a strange time, and we are lucky to have technology that allows us to stay connected while apart. This project is more than a tool: it is also an example of two people in different states, who have never met in person, collaborating on a project whose sole purpose is to help facilitate the virtual interaction of people from all over the country.

<br>

---

<br>

The Alumni Shuffler is a tool that helps intelligently create breakout groups during large Zoom calls. The project is in its infancy right now, and is currently being developed by Nathan Neibauer and Hayden Blair. 

Given a virtual Zoom event with upwards of 20 or 30 people, identified by certain characteristics ('track', 'year', 'hair color', 'likes _The Office_ ', etc.), this tool can help ensure that everyone can...

* Interact with everyone else in their group
* Interact with as many different people as possible

...without eating up too much mental real-estate from the coordinator, who likely wants to spend more time interacting with students and less time fussing with a spreadsheet. 
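
One way to make these two goals measurable is to count how often each pair of attendees has shared a breakout group. The sketch below is only an illustration under that assumption, not the notebook's implementation: `record_breakout` and `people_met` are hypothetical helpers, and the names are placeholders.

```python
from collections import Counter
from itertools import combinations

def record_breakout(pair_counts, groups):
    """Add one breakout round (a list of groups) to the running pair counts."""
    for group in groups:
        # Sort each group so that (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(group), 2):
            pair_counts[pair] += 1
    return pair_counts

def people_met(pair_counts, person):
    """Return the set of people who have shared a group with `person`."""
    met = set()
    for (a, b), count in pair_counts.items():
        if count and person in (a, b):
            met.add(b if a == person else a)
    return met

pair_counts = Counter()
record_breakout(pair_counts, [['Leslie', 'Ron'], ['April', 'Ann']])
record_breakout(pair_counts, [['Leslie', 'April'], ['Ron', 'Ann']])

print(people_met(pair_counts, 'Leslie'))  # Ron and April, in some order
print(pair_counts[('Ann', 'Leslie')])     # 0: this pair has not met yet
```

A pair with a count of zero has not met yet, so a coordinator could keep scheduling rounds until no zero-count pairs remain within the subset they care about.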

Alumni data must be saved in a spreadsheet on the Google Drive of the user operating this notebook, with one column for each category to sort by. An example might look like this:

<br>

Name | track | year | hair_color | hard_working
---|---|---|---|---
Leslie | optics | 2017 | blonde | yes
Ron | polymer | 2014 | brown | no
Jean Ralphio | semi | 2012 | black | no
April | sensors | 2019 | black | yes
Ann | semi | 2013 | brown | yes

<br>

For now, I would imagine that 'track' and 'year' are the only identifiers, but the code should work with any number of them.
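
As a rough illustration of how such a table can drive the grouping, the sketch below rebuilds the example rows as a pandas DataFrame and collects names by identifier value. The helper `names_by` is hypothetical and not part of the notebook, the lowercase `name` column is an assumption chosen to match the code cells, and reading a real sheet from Drive is left out.

```python
import pandas as pd

# The example rows from the table above, typed in by hand for illustration.
alumni = pd.DataFrame({
    'name': ['Leslie', 'Ron', 'Jean Ralphio', 'April', 'Ann'],
    'track': ['optics', 'polymer', 'semi', 'sensors', 'semi'],
    'year': [2017, 2014, 2012, 2019, 2013],
    'hair_color': ['blonde', 'brown', 'black', 'black', 'brown'],
    'hard_working': ['yes', 'no', 'no', 'yes', 'yes'],
})

def names_by(df, by):
    """Map each distinct value in column `by` to the list of names that share it."""
    return {value: list(group['name']) for value, group in df.groupby(by)}

print(names_by(alumni, 'track')['semi'])        # ['Jean Ralphio', 'Ann']
print(names_by(alumni, 'hair_color')['black'])  # ['Jean Ralphio', 'April']
```

Because the helper only takes a column name, the same call should work for any identifier column the spreadsheet happens to contain.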

## Setup

### Imports

In [0]:
import pandas as pd
import numpy as np
import random
import os
from google.colab import drive, widgets
from IPython.display import display, HTML
from itertools import combinations
drive.mount('/content/gdrive')
NAMES_DIR = '/content/gdrive/My Drive/software_development/alumni_shuffler/names'

### Definitions

In [0]:
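def import_names(names_dir):
  '''Loads the first 'yob*' names file found in `names_dir` as a DataFrame.'''
  # List the directory that was passed in rather than the global NAMES_DIR;
  # sorting makes the choice of "first" file deterministic.
  name_files = sorted(f for f in os.listdir(names_dir) if f.startswith('yob'))
  df = pd.read_csv(os.path.join(names_dir, name_files[0]))
  return df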

def make_fake_data(max_people=40):
  '''Builds a fake alumni DataFrame with random tracks and years for testing.'''
  df = import_names(NAMES_DIR)
  track_names = ['optics', 'semi', 'polymer', 'sensors']
  years = list(map(str, range(2013, 2020)))
  # Rename the three columns; the last two are overwritten with fake values below.
  df.columns = ['name', 'year', 'track']
  df['track'] = [random.choice(track_names) for _ in range(len(df))]
  #df['track'] = ['polymer' for _ in range(len(df))] <- previously used a polymer only array for testing purposes
  df['year'] = [random.choice(years) for _ in range(len(df))]

  # Copy the slice so the columns added below do not trigger SettingWithCopyWarning.
  df = df.iloc[:max_people].copy()
  person_id = list(map(str, range(max_people)))

  # One pair of zero-initialized columns per person: '<id>' and '<id>_cnsctv'.
  for i in person_id:
    df[i] = np.zeros(len(df))
    df[i + '_cnsctv'] = np.zeros(len(df))

  return df


In [0]:
def breakout(alumni, by='track'):
  groups = alumni[by].unique()
  grid = widgets.Grid(1, len(groups))
  heading = '''<center>
                <h1>%s</h1>
               </center>'''
  for i, group in enumerate(groups):
    with grid.output_to(0, i):
      display(HTML(heading % group))
      display(HTML(alumni[alumni[by] == group]._repr_html_()))

#breakout(alumni, by='track')

In [0]:
class ZoomSesh:
  '''Object that helps organize large groups of people during a zoom call.'''

  def __init__(self, filename=None, max_people=40):
    '''Constructor
    Args:
      filename: location of a file containing the alumni for this session. Columns
        TBD. If `None`, then ZoomSesh will initialize with fake data of len `max_people`.
  
      max_people: int specifying number of people for fake data. Does not do anything
        unless `filename` is None.
  
    '''
    if filename is not None:
      raise NotImplementedError('This feature is not ready yet')
    
    else:
      # For development, testing, debugging, etc.
      self._alumni_history = [make_fake_data(max_people=max_people)]

  @property
  def alumni(self):
    '''Returns 'current' alumni matrix, which is the top matrix in the stack.'''
    return self._alumni_history[0]


  def breakout(self, by, min_group_size, max_group_size, same=True, n=None):
    '''Generates breakouts based on the current state.
    
    Args:
      by: string identifier to use for combining alumni
      min_group_size: int, smallest acceptable group size
      max_group_size: int, largest acceptable group size
      same: bool saying whether to combine alumni based on similarities
        (same=True) or differences (same=False).
      n: number of subsequent breakouts. If n=None, then will return the min
        number of breakouts required for everyone to see everyone else according
        to `by` and `same`. 

    Examples:

    >>> z = ZoomSesh('file.xlsx')
    --------------------------------------------------

    Do as many breakouts as necessary to ensure that everyone of the same
    track sees each other in groups of 4 to 5

    >>> z.breakout('track', 4, 5, same=True, n=None)
    --------------------------------------------------
    Do 2 breakouts of up to 6 people per group where everyone in each group
    is from a different year.

    >>> z.breakout('year', 0, 6, same=False, n=2)
    --------------------------------------------------

    Returns:
      breakouts: dictionary of the form {'breakout_i': DataFrame}, where i is
        the breakout number.

    '''
    pass


  def _min_combo(self, alumni, by=None, arg=None, group_size=6):
    '''Creates a random group that minimizes overlap with alumni in previous breakouts.    

    Args:
      alumni: 
      by:
      arg:
      group_size:

    Returns:
      combos: a tuple of alumni indices for this group
    '''

    pass

  def _group_split(self, by, arg, min_group_size, max_group_size):
    '''Creates breakout groups based on similarities or differences of various
    alumni identifiers, such as 'track' or 'year'. 

    Args:
      by: 
      arg:
      min_group_size:
      max_group_size:
    
    Returns:
      groups: a list of tuples containing indices for the different breakout
        groups, each holding between `min_group_size` and `max_group_size` alumni
      extras: tuple containing leftover students once the groups are full

    Given an alumni matrix, a subset to select groups from (by & arg), and a
    group size range, breaks the subset of alumni into as many groups as possible.
    '''
    pass

  def summary_html(self):
    years = dict(self.alumni.year.value_counts())
    tracks = dict(self.alumni.track.value_counts())
    #css for table
    style = '''
    table {
      border: 1px solid black;
      width: auto;
      padding:5px;
    }

    .pandas-table {
      display:inline-block;
      border: 10px solid black;
      width: 20%;
      padding:10px;
    }

    th, td {
      padding: 5px;
      text-align: left;
      border: 1px solid #ddd;
    }

    .grid-container {
      display: grid;
      grid-template-columns: 25% 75%;
      background-color: #007030;
      padding: 10px;
    }

    .grid-item {
      background-color: rgb(255, 255, 255);
      border: 1px solid rgba(0, 0, 0, 0.8);
      padding: 10px;
      font-size: 12px;
      text-align: center;
      }

      .left {
        grid-column-start: 1;
        grid-column-end: 2;
      }
      .right {
        grid-column-start: 2
      }


    span.inlineTable {
      display: inline-block;
      vertical-align: text-top;
      padding: 5px;
    }
    '''
    def split_alumni():
      alumni = self.alumni[['name', 'year', 'track']]
      l = len(alumni)
      if l % 10 == 0:
        N = l // 10
      else:
        N = 1 + (l // 10)
      groups = [alumni[10*n:10*(n + 1)].to_html(classes="pandas-table", border=5) for n in range(0, N)]
      return groups

    row = lambda entry: f'<tr>{entry}</tr>'
    row_entries = lambda d: ''.join([
                    row(f'<td>{key}</td><td>{value}</td>') for key, value in d.items()])
    header = lambda attr: row(f'<th>{attr}</th><th>count</th>')

    year_table = f'''<table>
                     <caption>Years</caption>
                     {header("year")}{row_entries(years)}
                     </table>'''
    track_table = f'''<table>
                     <caption>Tracks</caption>
                     {header("track")}{row_entries(tracks)}
                     </table>'''

    html = '''
    <head>
      <style>{style}</style>
    </head>
    <body>
      <div class="grid-container">
        <div class="grid-item left">
          <h2>Total Attendees: {n}</h2>
          <span class="inlineTable">
            {year_table}
          </span>
          <span class="inlineTable">
            {track_table}
          </span>
        </div>
        <div class="grid-item right">
          {alumni}
        </div>
      </div>
    </body>'''
    return HTML(html.format(n=len(self.alumni),
                            style=style,
                            year_table=year_table,
                            track_table=track_table,
                            alumni=''.join(split_alumni())
                            ))

class ZoomSeshTest(ZoomSesh):
  '''Test Class for displaying things'''
  def breakout(self, group_size):
    num_alumni = len(self.alumni)
    df = self.alumni.sample(frac=1)[['name', 'year', 'track']]
    # iloc slicing clips at the end of the frame, so the final (possibly short)
    # group keeps everyone; the old `df.iloc[n:-1]` branch dropped the last person.
    breakouts = [df.iloc[n:n + group_size] for n in range(0, num_alumni, group_size)]
    return {f'breakout_{i}': b for i, b in enumerate(breakouts)}
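
The `_group_split` stub above is still unimplemented. As a hedged sketch of one possible approach (not the authors' method), the hypothetical function `split_into_groups` below cuts a list of alumni indices into consecutive groups of `max_group_size` and reports a trailing group smaller than `min_group_size` as extras.

```python
def split_into_groups(indices, min_group_size, max_group_size):
    """Split `indices` into consecutive groups of at most `max_group_size`.

    Returns (groups, extras): `groups` is a list of tuples, and `extras` holds
    the leftover indices when the final chunk is smaller than `min_group_size`.
    """
    if not 0 < min_group_size <= max_group_size:
        raise ValueError('need 0 < min_group_size <= max_group_size')
    indices = list(indices)
    # Plain slicing never drops the last element, even when the final chunk is short.
    chunks = [tuple(indices[n:n + max_group_size])
              for n in range(0, len(indices), max_group_size)]
    if chunks and len(chunks[-1]) < min_group_size:
        return chunks[:-1], chunks[-1]
    return chunks, ()

groups, extras = split_into_groups(range(11), min_group_size=3, max_group_size=4)
print(groups)  # [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10)]
print(extras)  # ()

groups, extras = split_into_groups(range(10), min_group_size=3, max_group_size=4)
print(groups)  # [(0, 1, 2, 3), (4, 5, 6, 7)]
print(extras)  # (8, 9)
```

Shuffling the indices before splitting would give random groups; what to do with the extras (for example, spreading them across existing groups) is a policy choice this sketch leaves open.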

### Tests / Debug

In [0]:
zt = ZoomSeshTest()
display(zt.summary_html())

In [0]:
zt = ZoomSeshTest()
breakouts = zt.breakout(7)

html = '''<center>
            <h2>{title}</h2>
            {data}
          </center>'''

grid = widgets.Grid(2, 3)
for i, b in enumerate(breakouts):
  # Fill the 2 x 3 grid row by row: breakouts 0-2 on the top row, 3-5 below.
  row, col = divmod(i, 3)
  with grid.output_to(row, col):
    display(HTML(html.format(title=b, data=breakouts[b]._repr_html_())))
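
The breakouts displayed by this cell are drawn purely at random. The `_min_combo` docstring describes picking groups that minimize overlap with earlier breakouts; one simple way to approximate that, sketched here under assumptions, is to try several random shuffles and keep the one that repeats the fewest past pairings. `repeat_score` and `best_of_random_rounds` are hypothetical names, and this is a random search rather than a guaranteed minimum.

```python
import random
from collections import Counter
from itertools import combinations

def repeat_score(groups, pair_counts):
    """Count how many past meetings the pairs in `groups` would repeat."""
    return sum(pair_counts[pair]
               for group in groups
               for pair in combinations(sorted(group), 2))

def best_of_random_rounds(people, group_size, pair_counts, tries=200, seed=None):
    """Try `tries` random shuffles and return the grouping with the lowest score."""
    rng = random.Random(seed)
    best, best_score = None, None
    for _ in range(tries):
        shuffled = list(people)
        rng.shuffle(shuffled)
        groups = [shuffled[n:n + group_size]
                  for n in range(0, len(shuffled), group_size)]
        score = repeat_score(groups, pair_counts)
        if best_score is None or score < best_score:
            best, best_score = groups, score
    return best, best_score

people = ['Leslie', 'Ron', 'April', 'Ann']
# Pretend one round already happened: Leslie met Ron, and April met Ann.
pair_counts = Counter({('Leslie', 'Ron'): 1, ('Ann', 'April'): 1})

groups, score = best_of_random_rounds(people, 2, pair_counts, seed=0)
print(groups, score)
```

With four people in pairs, two of the three possible pairings avoid every earlier meeting, so the search should settle on a score of 0 here. For 40 attendees the search space is far larger, and a smarter heuristic may be needed.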
    


## Examples and UI

In [0]:
#@title Breakout Explorer