# Keyword Analysis of CHI Best Papers 2016-2020

In this analysis, we collect and chart the author keywords of the Best Paper winners at the Conference on Human Factors in Computing Systems (CHI) from 2016 to 2020.
The aim is to identify commonalities among the top 1% of papers honoured by the Best Paper Committee and to explore significant trends among them.

## Reading in the data

First, we import the necessary packages and read in the keyword data from the respective file.

In [8]:
import pandas as pd

def string_to_list(list_as_string: str, delimiter: str = ", ") -> list[str]:
    """Split a delimited string into a list of strings.

    Note: the slice drops the final character of the input before splitting.
    """
    return list_as_string[:-1].split(delimiter)

best_papers = pd.read_csv("best_papers.csv", delimiter=";", converters={"keywords": string_to_list})

print("Information about Best Papers dataframe:")
best_papers.info()  # info() prints its summary itself and returns None
print()
print("First 5 rows of Best Papers dataframe:")
print(best_papers.head())  # head() returns the first 5 rows by default

Information about Best Papers dataframe:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   year      10 non-null     int64 
 1   title     10 non-null     object
 2   authors   10 non-null     object
 3   keywords  10 non-null     object
 4   doi       10 non-null     object
dtypes: int64(1), object(4)
memory usage: 528.0+ bytes

First 5 rows of Best Papers dataframe:
   year                                              title  \
0  2020  Bug or Feature? Covert Impairments to Human Co...   
1  2020  Design Study "Lite" Methodology: Expediting De...   
2  2020             Trigeminal-based Temperature Illusions   
3  2020  Techniques for Flexible Responsive Visualizati...   
4  2020  Beyond the Prototype: Understanding the Challe...   

                                             authors  \
0                                    [John V. Monac]   
1  [Uzma

## To Do

- normalize to keywords
- lowercase
- replace abbreviations (e.g. AI)
- split "/" cases (e.g. AR/VR)
- check correlation with CCS concepts?
- number of author keywords in best papers
- American vs. British spellings
- strip leading and trailing spaces
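
Several of the items above (lowercasing, stripping spaces, replacing abbreviations, splitting "/" cases, unifying spellings) could be handled by a single normalization step. The following is only a minimal sketch under stated assumptions: the function names `normalize_keyword` and `normalize_keywords` and the two lookup tables are hypothetical, and their entries are illustrative rather than taken from the data.

```python
import re

# Hypothetical abbreviation map; entries are illustrative, not derived from the dataset.
ABBREVIATIONS = {
    "ai": "artificial intelligence",
    "ar": "augmented reality",
    "vr": "virtual reality",
}

# Hypothetical British -> American spelling map; extend as needed.
SPELLING_VARIANTS = {
    "visualisation": "visualization",
    "behaviour": "behavior",
}


def normalize_keyword(keyword: str) -> list[str]:
    """Return the normalized form(s) of one raw keyword; "/" cases yield several."""
    normalized = []
    for part in keyword.split("/"):
        # Strip outer spaces, lowercase, and collapse inner runs of whitespace.
        part = re.sub(r"\s+", " ", part.strip().lower())
        if not part:
            continue
        # Expand the keyword if it is a known abbreviation.
        part = ABBREVIATIONS.get(part, part)
        # Unify spelling variants word by word.
        words = [SPELLING_VARIANTS.get(word, word) for word in part.split(" ")]
        normalized.append(" ".join(words))
    return normalized


def normalize_keywords(keywords: list[str]) -> list[str]:
    """Normalize one paper's keyword list, dropping duplicates but keeping order."""
    seen = {}
    for keyword in keywords:
        for item in normalize_keyword(keyword):
            seen.setdefault(item, None)
    return list(seen)


example = [" AR/VR", "Visualisation ", "AI", "virtual reality"]
print(normalize_keywords(example))
```

Assuming the `keywords` column holds lists of strings as read in above, the function could be applied per paper with `best_papers["keywords"].apply(normalize_keywords)`, and the number of author keywords per paper could then be read off with `.str.len()` on the resulting column. Note that splitting on every "/" is a simplification; keywords in which the slash is part of a single term would need to be exempted.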