<a href="https://colab.research.google.com/github/nneibaue/alumni_shuffler/blob/ui/Alumni_Shuffler.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>Alumni Shuffler</h1>


Hello all! Welcome to Google Colab, an awesome tool for sharing Python notebooks in Google Drive. This has been a fun project to work on, and we hope that it is useful! We live in a strange time, and are lucky to have technology that allows us to stay connected while apart. This project is not only a tool but also an example of two people in different states, who have never met in person, collaborating on a project whose sole purpose is to help facilitate the virtual interaction of people from all over the country.

<br>

---

<br>

The Alumni Shuffler is a tool that helps intelligently create breakout groups during large Zoom calls. The project is in its infancy right now, and is currently being developed by Nathan Neibauer and Hayden Blair. 

Given a virtual Zoom event with upwards of 20 or 30 people, identified by certain characteristics ('track', 'year', 'hair color', 'likes _The Office_ ', etc.), this tool can help ensure that everyone can...

* Interact with everyone else in their group
* Interact with as many different people as possible

...without eating up too much mental real-estate from the coordinator, who likely wants to spend more time interacting with students and less time fussing with a spreadsheet. 

Alumni data must be saved in a spreadsheet on the Google Drive of the user operating this notebook, with columns representing the categories to sort by. An example might look like this:

<br>

Name | track | year | hair_color | hard_working
---|---|---|---|---
Leslie | optics | 2017 | blonde | yes
Ron | polymer | 2014 | brown | no
Jean Ralphio | semi | 2012 | black | no
April | sensors | 2019 | black | yes
Ann | semi | 2013 | brown | yes

<br>

For now, I would imagine that 'track' and 'year' are the only identifiers, but the code should work with any number of them.
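
As a rough illustration of the layout described above, the sketch below parses a CSV export of such a sheet and groups names by one identifier column. The `SAMPLE_CSV` text and the `group_by_column` helper are hypothetical and not part of this notebook, which reads its data with pandas; the sketch sticks to the standard library so it can run outside Colab.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV export of the example sheet above (column names lowercased).
SAMPLE_CSV = """name,track,year,hair_color,hard_working
Leslie,optics,2017,blonde,yes
Ron,polymer,2014,brown,no
Jean Ralphio,semi,2012,black,no
April,sensors,2019,black,yes
Ann,semi,2013,brown,yes
"""


def group_by_column(csv_text, column):
    """Map each distinct value in `column` to the list of names that have it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[column]].append(row["name"])
    return dict(groups)


by_track = group_by_column(SAMPLE_CSV, "track")
print(by_track["semi"])  # names of everyone on the 'semi' track
```

Any other column, such as `year` or `hair_color`, can be passed in the same way, which is the sense in which the code should work with any number of identifiers.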

## Setup

### Imports

In [5]:
import pandas as pd
import numpy as np
import random
import os
import ipywidgets 
from google.colab import drive, widgets, output
from IPython.display import display, HTML
from itertools import combinations
drive.mount('/content/gdrive')
NAMES_DIR = '/content/gdrive/My Drive/software_development/alumni_shuffler/names'

Mounted at /content/gdrive


### Definitions

In [0]:
#@title import_names 
def import_names(dir):
  # Search the directory that was passed in rather than the global NAMES_DIR.
  name_files = [f for f in os.listdir(dir) if f.startswith('yob')]
  df = pd.read_csv(os.path.join(dir, name_files[0]))
  return df

In [0]:
#@title make_fake_data
def make_fake_data(max_people=40):
  df = import_names(NAMES_DIR)
  track_names = ['optics', 'semi', 'polymer', 'sensors']
  years = list(map(str, range(2013, 2020)))
  df.columns = ['name', 'year', 'track']
  df['track'] = [random.choice(track_names) for _ in range(len(df))]
  #df['track'] = ['polymer' for _ in range(len(df))] <- previously used a polymer only array for testing purposes
  df['year'] = [random.choice(years) for _ in range(len(df))]

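  # Copy the slice so the columns added below do not write into a view of the full frame.
  df = df.iloc[:max_people].copy()
  # One pair of counter columns per person, labelled by the row index as a string.
  person_id = list(map(str, df.index))

  for i in person_id:
    df[i] = np.zeros(len(df))
    df[i+"_cnsctv"] = np.zeros(len(df))

  return df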
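
`make_fake_data` depends on name files stored in Google Drive. For trying the ideas outside Colab, a stand-in could be sketched with the standard library alone. Everything here (`make_fake_people`, the placeholder `person_i` names, the `seed` parameter) is an assumption made for illustration; only the track names and the 2013 to 2019 year range are taken from the function above.

```python
import random

TRACKS = ['optics', 'semi', 'polymer', 'sensors']
YEARS = [str(y) for y in range(2013, 2020)]


def make_fake_people(n=40, seed=None):
    """Return `n` fake alumni as dicts with random 'year' and 'track' values."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    return [
        {'name': f'person_{i}',
         'year': rng.choice(YEARS),
         'track': rng.choice(TRACKS)}
        for i in range(n)
    ]


people = make_fake_people(n=5, seed=0)
for person in people:
    print(person)
```

Passing a fixed `seed` gives the same fake roster on every run, which can make debugging the grouping logic easier.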

In [0]:
#@title ZoomSesh
class ZoomSesh:
  '''Object that helps organize large groups of people during a zoom call.'''

  def __init__(self, filename=None, max_people=40):
    '''Constructor
    Args:
      filename: location of a file containing the alumni for this session. Columns
        TBD. If `None`, then ZoomSesh will initialize with fake data of len `max_people`.
  
      max_people: int specifying number of people for fake data. Does not do anything
        unless `filename` is None.
  
    '''
    if filename is not None:
      raise NotImplementedError('This feature is not ready yet')
    
    else:
      # For development, testing, debugging, etc.
      self._alumni_history = [make_fake_data(max_people=max_people)]

  @property
  def alumni(self):
    '''Returns 'current' alumni matrix, which is the top matrix in the stack.'''
    return self._alumni_history[0]


  def breakout(self, by, group_size, diff=False, n=None):
    '''Generates a single breakout based on the current state.
    
    Args:
      by: string identifier (a column such as 'track' or 'year') to use for
        combining alumni, or 'all' to ignore identifiers.
      group_size: int number of people per group. Ranges of acceptable group
        sizes are not supported yet.
      diff: bool saying whether to combine alumni based on differences
        (diff=True) or similarities (diff=False).
      n: number of subsequent breakouts. Not implemented yet; currently ignored.

    Examples:

    >>> z = ZoomSesh()
    --------------------------------------------------

    Do one breakout in groups of 4 where everyone in each group is from the
    same track

    >>> z.breakout('track', 4)
    --------------------------------------------------
    Do one breakout of 6 people per group where everyone in each group
    is from a different year.

    >>> z.breakout('year', 6, diff=True)
    --------------------------------------------------

    Returns:
      (extras, groups) when diff=True or by='all'. Otherwise a tuple
        (all_extras, all_groups) of dictionaries keyed by each unique value
        of `by`.

    '''
    alumni = self.alumni
    if diff:
      return self._group_split(by, 'diff', group_size)

    elif by in list(alumni.keys()):
      all_extras = {}
      all_groups = {}
      vals = alumni[by].unique()

      for val in vals:
        extras, groups = self._group_split(by, val, group_size)
        all_extras[val] = extras
        all_groups[val] = groups

      return all_extras, all_groups

    elif by == 'all':
      # `_group_split` takes (by, arg, group_size); `arg` is unused when by == 'all'.
      return self._group_split(by, None, group_size)


    else:
      print("Invalid breakout input")
      return 1,1



  def _min_combo(self, alumni, by=None, arg=None, group_size=6):
    '''Creates a random group that minimizes overlap with alumni in previous breakouts.    

    Args:
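      alumni: DataFrame of alumni who are still available for grouping
      by: column to select on, or 'all' to consider every row
      arg: value of `by` to select, or 'diff' to require distinct `by` values
      group_size: int number of people in the group

    Returns:
      combos: a tuple of alumni indices for this group, or 1 if no valid
        'diff' group exists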
    '''

    if by == 'all' or arg == 'diff':
      indices = alumni.index
    else:
      indices = alumni[alumni[by] == arg].index

    combos = list(combinations(indices,group_size))

    #!!! Current diff is only for year OR track. Full diff (each group member has different year and different track) not implemented yet
    if arg == 'diff':
      temp_combos = []
      for combo in combos:
        vals = alumni.loc[alumni.index.isin(combo),by]
        if len(vals) == len(set(vals)):
          temp_combos.append(combo)
      if len(temp_combos) == 0:
        # Sentinel: no group with all-different `by` values can be formed.
        return 1    
      combos = temp_combos

    sums = []
    n = 0
    for combo in combos:
      temp_sum = 0
      n += 1

      for i in combo:
        twoD_mask = [(col,col+"_cnsctv") for col in list(map(str,combo)) if col != str(i)]
        mask = [col for sub_col in twoD_mask for col in sub_col]
        line = alumni[alumni.index == i][mask]
        temp_sum += np.sum(line.values)

      sums.append(temp_sum)


    sums = np.array(sums)
    min_list = np.where(sums == np.amin(sums))[0]
  
    return combos[random.choice(min_list)]

    

  def _group_split(self, by, arg, group_size):
    '''Creates breakout groups based on similarities or differences of various
    alumni identifiers, such as 'track' or 'year'. 

    Given an alumni matrix, a subset to select groups from (by & arg), and a group size,
    breaks the subset of alumni into as many groups as possible.

    Args:
      by: column to group on, or 'all' to draw from every row
      arg: value of `by` to select, or 'diff' to build groups whose members
        all differ in `by`
      group_size: int number of people per group
    
    Returns:
      extras: list of indices of leftover alumni once no more full groups
        can be made
      groups: a list of tuples containing indices for the different breakout
        groups of size `group_size`
    '''

    prev_combos = []
    extras = []
    alumni = self.alumni
    end_of_split = False

    while(True):
      flat_prev_combos = [item for combo in prev_combos for item in combo]
      current_df = alumni[~alumni.index.isin(flat_prev_combos)]

      if end_of_split:
        extras = list(current_df.index)

      elif by == 'all':
        if len(current_df) < group_size:
          extras = list(current_df.index)
          end_of_split = True
        
      elif arg != 'diff':
        if len(current_df[current_df[by] == arg]) < group_size: #!!!! doesn't work for full diff
          extras = list(current_df[current_df[by] == arg].index)
          end_of_split = True


      if end_of_split:
        mask = [str(i)+"_cnsctv" for i in extras]
        alumni.loc[~alumni.index.isin(extras), mask] = 0
        break

    

      else:
        combo = self._min_combo(current_df, by=by, arg=arg, group_size=group_size)

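        if combo == 1:
          # No valid group could be formed. Finish on the next pass, where the
          # remaining alumni are collected as extras; otherwise this loop never ends.
          end_of_split = True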
      
        else:
          for i in combo:
            mask = list(map(str, combo))
            mask.remove(str(i))
            mask = mask + [index+"_cnsctv" for index in mask]
            alumni.loc[alumni.index == i, mask] += 1

          mask = [str(i)+"_cnsctv" for i in combo]
          alumni.loc[~alumni.index.isin(combo), mask] = 0 

          prev_combos.append(combo)

    return extras, prev_combos


  #TODO: Use HtmlMaker class for this function. Then we can compose different
  #      HtmlMakers by overloading `__add__` e.g. 
  def summary_html(self):
    years = dict(self.alumni.year.value_counts())
    tracks = dict(self.alumni.track.value_counts())

    #css for table
    style = '''
    table {
      border: 1px solid black;
      width: auto;
      padding:5px;
    }

    .pandas-table {
      display:inline-block;
      border: 10px solid black;
      width: 20%;
      padding:10px;
    }

    th, td {
      padding: 5px;
      text-align: left;
      border: 1px solid #ddd
    }

    .grid-container {
      display: grid;
      grid-template-columns: 25% 75%;
      background-color: #007030;
      padding: 10px;
    }

    .grid-item {
      background-color: rgb(255, 255, 255);
      border: 1px solid rgba(0, 0, 0, 0.8);
      padding: 10px;
      font-size: 12px;
      text-align: center;
      }

      .left {
        grid-column-start: 1;
        grid-column-end: 2;
      }
      .right {
        grid-column-start: 2
      }


    span.inlineTable {
      display: inline-block;
      vertical-align: text-top;
      padding: 5px;
    }
    '''
    def split_alumni():
      alumni = self.alumni[['name', 'year', 'track']]
      l = len(alumni)
      if l % 10 == 0:
        N = l // 10
      else:
        N = 1 + (l // 10)
      groups = [alumni[10*n:10*(n + 1)].to_html(classes="pandas-table", border=5) for n in range(0, N)]
      return groups

    row = lambda entry: f'<tr>{entry}</tr>'
    row_entries = lambda d: ''.join([
                    row(f'<td>{key}</td><td>{value}</td>') for key, value in d.items()])
    header = lambda attr: row(f'<th>{attr}</th><th>count</th>')

    year_table = f'''<table>
                     <caption>Years</caption>
                     {header("year")}{row_entries(years)}
                     </table>'''
    track_table = f'''<table>
                     <caption>Tracks</caption>
                     {header("track")}{row_entries(tracks)}
                     </table>'''

    html = '''
    <head>
      <style>{style}</style>
    </head>
    <body>
      <div class="grid-container">
        <div class="grid-item left">
          <h2>Total Attendees: {n}</h2>
          <span class="inlineTable">
            {year_table}
          </span>
          <span class="inlineTable">
            {track_table}
          </span>
        </div>
        <div class="grid-item right">
          {alumni}
        </div>
      </div>
    </body>'''
    return html.format(n=len(self.alumni),
                       style=style,
                       year_table=year_table,
                       track_table=track_table,
                       alumni=''.join(split_alumni())
                       )
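
The core idea behind `_min_combo`, picking a group whose members have met each other the fewest times, can be sketched on its own. This is a simplified illustration under stated assumptions, not the class's implementation: it keeps meeting counts in a plain dictionary (the hypothetical `met`) instead of DataFrame columns, and it omits the `_cnsctv` consecutive-meeting counters. The names `pick_min_overlap_group` and `record_group` are made up for this sketch.

```python
import random
from itertools import combinations


def pick_min_overlap_group(people, met, group_size, rng=random):
    """Pick a group whose members have shared the fewest previous groups.

    `met[(a, b)]` with a < b counts how often a and b were grouped together.
    Ties are broken at random. Returns None if no group of that size exists.
    """
    best_score = None
    best = []
    for combo in combinations(sorted(people), group_size):
        # Total number of past meetings among all pairs inside this candidate.
        score = sum(met.get(pair, 0) for pair in combinations(combo, 2))
        if best_score is None or score < best_score:
            best_score, best = score, [combo]
        elif score == best_score:
            best.append(combo)
    return rng.choice(best) if best else None


def record_group(met, group):
    """Increment the meeting count for every pair in `group`."""
    for pair in combinations(sorted(group), 2):
        met[pair] = met.get(pair, 0) + 1


met = {}
record_group(met, (0, 1, 2))            # persons 0, 1, 2 already met once
group = pick_min_overlap_group(range(5), met, 2)
print(group)                            # a pair that has not met yet
```

Scoring every combination is a brute-force approach: choosing 6 people out of 40 already means about 3.8 million candidates, so a sampling or greedy strategy may be worth considering for larger events.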

In [0]:
#@title ZoomSeshTest
class ZoomSeshTest(ZoomSesh):
  '''Test Class for displaying things'''
  def breakout(self, group_size):
    num_alumni = len(self.alumni)
    df = self.alumni.sample(frac=1)[['name', 'year', 'track']]
    # iloc slicing clips at the end of the frame, so the final (possibly short)
    # chunk keeps every remaining person instead of dropping the last one.
    breakouts = [df.iloc[n:n+group_size] for n in range(0, num_alumni, group_size)]
    return_dict = {}
    for i, b in enumerate(breakouts):
      if len(b) == group_size:
        return_dict[f'group{i}'] = np.array(b.index)
      else:
        return_dict['extras'] = np.array(b.index)
    return return_dict
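
The shuffle-then-chunk approach used by this test class can be shown without pandas. The sketch below is an illustration only; `random_groups` and its `seed` parameter are hypothetical names, and it works on a plain list of ids rather than a DataFrame.

```python
import random


def random_groups(ids, group_size, seed=None):
    """Shuffle `ids` and cut them into groups of `group_size`.

    Full groups are stored under 'group0', 'group1', ...; a short final
    chunk, if any, is stored under 'extras'.
    """
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    # Slicing clips at the end of the list, so nobody is dropped.
    chunks = [ids[i:i + group_size] for i in range(0, len(ids), group_size)]
    result = {}
    for i, chunk in enumerate(chunks):
        if len(chunk) == group_size:
            result[f'group{i}'] = chunk
        else:
            result['extras'] = chunk
    return result


result = random_groups(range(10), 4, seed=1)
for label, members in result.items():
    print(label, members)
```

With 10 people and groups of 4, this yields two full groups and two extras; when the head count divides evenly, no `'extras'` key is produced.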

In [0]:
#@title HtmlMaker 

class HtmlMaker:
  default_style = {
        'table': {
          'border-spacing':'0px',
        },
  }
  
  def __init__(self, use_default_style=True):
    self._elements = []
    self.style={}

    if use_default_style:
      self.apply_style(HtmlMaker.default_style.copy())

  
  def _to_css_style(self):
    if not self.style:
      return ''
    css_elements = []
    css_element_template = ('{classname} {{\n  {prop_list}\n  }}\n')
    for classname in self.style:
      inner = self.style[classname]
      prop_list = '\n  '.join(
          [f'{prop_name}: {prop_value};' for prop_name, prop_value in inner.items()])
      css_elements.append(css_element_template.format(
          classname=classname,
          prop_list=prop_list
      ))
    css = ''.join(css_elements)
    return f'<style>\n{css}\n</style>'

  def apply_style(self, style_dict, merge=True):
    '''Applies a css style dictionary to HtmlMaker object.
  
    Args:
      style_dict: dictionary containing css classes and valid css properties. 
      merge: whether to merge with existing style or replace it.
  
      Example: 
      >>> default_style = {
      ...     'div.bluebox': {
      ...       'border': '2px solid blue',
      ...       'padding': '10px',
      ...     },
      ...     'table': {
      ...       'border-spacing':'0px',
      ...     },
      ...     'div.horizontal': {
      ...       'display': 'inline-table',
      ...       'border': '2px solid green',
      ...       'padding': '10px',
      ...     },
      ... }
    '''
    if not merge:
      # Replace the existing style instead of merging into it.
      self.style = {}
    self.style.update(style_dict)
        
        
  def add_pandas_df(self, df, td_class="", title=None, enclosing_tag=None, css_classes=None):
    def get_row(row):
      entries = '' 
      for col in df.columns:
        entries += f'<td class="{td_class}">{row[col]}</td>'
      return f'<tr>{entries}</tr>'

    #body = df.apply(lambda row: '<tr>' + ''.join([f'<td>{row[col]}<td>' for col in df.columns]) + '</tr>', axis=1).values.astype(str)
    body = df.apply(get_row, axis=1)
    body = ''.join(body.values.astype(str))
    if not title:
      title = ''
    else:
      title = f'<h2>{title}</h2>'
    html = (
        f'''
        {title}
        <table>
          {body}
        </table>
        '''
    )

    self.add_html_element(html,
                          enclosing_tag=enclosing_tag,
                          css_classes=css_classes)


  def add_html_element(self, data,
                       enclosing_tag=None,
                       css_classes=None,
                       insert_at_front=False):
    '''Adds a generic html element to the HtmlMaker

    Args: 
      data: html string to add
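      enclosing_tag: optional tag to wrap html. This must be the tag name
        without the brackets ('<' and '>').
      css_classes: optional list of strings. Which css classes to add to
        `enclosing_tag`. Ignored if `enclosing_tag` is None.
      insert_at_front: if True, insert the element before all existing
        elements instead of appending it.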

    Example
    ```
    >>> m = HtmlMaker()
    >>> m.add_html_element("I'm inside a tag!",
                           enclosing_tag='div', 
                           css_classes=["fat-box", "output-area"])
    ```
    '''
    if enclosing_tag is not None:
      if ('<' in enclosing_tag) or ('>' in enclosing_tag):
        raise ValueError('Brackets must not be included. Use tag name by itself.')

      if css_classes is not None:
        assert isinstance(css_classes, list)
        front = f'<{enclosing_tag} class="{" ".join(css_classes)}">'
        back = f'</{enclosing_tag}>'
      
      else:
        front = f'<{enclosing_tag}>'
        back = f'</{enclosing_tag}>'

    else:
      front = ''
      back = ''
    html = f'{front}{data}{back}'

    if insert_at_front:
      self._elements.insert(0, html)
    else:
      self._elements.append(html)

  def to_html(self):
    # Prepend the css without mutating `self._elements`, so repeated calls
    # do not stack duplicate <style> blocks.
    html = self._to_css_style() + ''.join(self._elements)
    return HTML(html)
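
The style handling in `HtmlMaker` boils down to turning a nested dictionary into CSS text. A minimal standalone sketch of that conversion follows; `style_dict_to_css` is a hypothetical helper written for illustration, and its whitespace differs slightly from what `_to_css_style` emits.

```python
def style_dict_to_css(style):
    """Render {selector: {property: value}} as a CSS string."""
    rules = []
    for selector, props in style.items():
        body = '\n'.join(f'  {name}: {value};' for name, value in props.items())
        # Doubled braces produce literal '{' and '}' inside an f-string.
        rules.append(f'{selector} {{\n{body}\n}}')
    return '\n'.join(rules)


css = style_dict_to_css({
    'div.bluebox': {'border': '2px solid blue', 'padding': '10px'},
    'table': {'border-spacing': '0px'},
})
print(css)
```

Keeping styles as dictionaries until the last moment is what lets `apply_style` merge several style sets with a simple `dict.update`.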


### Tests / Debug

In [54]:
display(HTML(ZoomSeshTest().summary_html()))

year,count
2016,7
2015,6
2019,6
2014,6
2017,6
2018,5
2013,4

track,count
sensors,14
semi,12
optics,9
polymer,5

Unnamed: 0,name,year,track
0,Anna,2013,sensors
1,Emma,2019,semi
2,Elizabeth,2016,optics
3,Minnie,2016,optics
4,Margaret,2017,sensors
5,Ida,2015,sensors
6,Alice,2015,semi
7,Bertha,2018,polymer
8,Sarah,2016,sensors
9,Annie,2016,optics

Unnamed: 0,name,year,track
10,Clara,2017,semi
11,Ella,2019,sensors
12,Florence,2018,semi
13,Cora,2015,optics
14,Martha,2019,sensors
15,Laura,2014,sensors
16,Nellie,2017,semi
17,Grace,2017,semi
18,Carrie,2017,optics
19,Maude,2016,semi

Unnamed: 0,name,year,track
20,Mabel,2013,sensors
21,Bessie,2018,polymer
22,Jennie,2019,semi
23,Gertrude,2015,sensors
24,Julia,2019,optics
25,Hattie,2017,sensors
26,Edith,2016,sensors
27,Mattie,2015,sensors
28,Rose,2014,semi
29,Catherine,2016,semi

Unnamed: 0,name,year,track
30,Lillian,2018,optics
31,Ada,2019,optics
32,Lillie,2015,sensors
33,Helen,2013,polymer
34,Jessie,2014,polymer
35,Louise,2014,optics
36,Ethel,2014,polymer
37,Lula,2014,sensors
38,Myrtle,2018,semi
39,Eva,2013,semi


In [55]:
m = HtmlMaker()
m.apply_style({
    'div.bluebox-test': {
        'border': '2px solid blue',
        'width': '30%',
    }
})
m.add_html_element('Hello I am in a bluebox div! Rock Solid!',
                   enclosing_tag='div',
                   css_classes=['bluebox-test'])
m.to_html()

## UI

In [0]:
#@title make_breakout_ui
def make_breakout_ui(zt):
  breakout_output = ipywidgets.Output()

  # Callback for all buttons
  def perform_breakout(n, color):
    maker = HtmlMaker()
    breakouts = zt.breakout(n)
    cols = ['name', 'year', 'track']
    df_style = {'div.horizontal-table': {
               'display': 'inline-table',
               'padding': '1px'}
    }
    maker.apply_style(df_style)
      
    inner_maker = HtmlMaker(use_default_style=False)
    for group in breakouts:
      inner_maker.add_pandas_df(zt.alumni.iloc[breakouts[group]][cols],
                                td_class="",
                                title=group,
                                enclosing_tag='div',
                                css_classes=["horizontal-table"])

    maker.apply_style({
        f'div.{color}box': {
            'border': f'2px solid {color}',
            'padding': '10px',
        }
    })
    maker.add_html_element(inner_maker.to_html().data,
                           enclosing_tag='div',
                           css_classes=[f'{color}box'])

    #maker.apply_style({'table': {'display': 'inline-table'}})
    with breakout_output:
      display(maker.to_html())

  # Button maker
  def _button(n, color):
    button = ipywidgets.Button(
        description=f'Groups of {n}',
        style=ipywidgets.ButtonStyle(button_color='#ddd'),
        layout=ipywidgets.Layout(
            width='20%',
            height='30px',
            border=f'2px solid {color}',
        )
    )
    button.on_click(lambda b: perform_breakout(n, color))
    return button


  colors=['orange', 'green', 'blue', 'salmon']
  group_sizes = [3, 4, 5, 6]
  buttons = [_button(n, c) for n, c in zip(group_sizes, colors) ]
  box = ipywidgets.VBox([ipywidgets.HTML(zt.summary_html()),
                        ipywidgets.HBox(buttons),
                        breakout_output])

  
  display(box)

In [57]:
make_breakout_ui(ZoomSeshTest())
