# Environment

The dataset simulates an online learning system in which students take courses. Each course has a number of topics, and each topic can be presented in different ways to suit students from diverse backgrounds and with different preferences. Each way of presenting a topic is referred to as content. An omniscient policy (an oracle) knows the best way to teach every student. After content is presented, the student gives feedback on it. If the content is useful, the student is taught the next topic. If the content is not useful, the policy presents the next best content for the same topic. All feedback from students is recorded and used to decide which content to present next.

The notebook is named after the Beta distribution, which the oracle uses to select content. The oracle knows the best content and hence does not need to explore, unlike other approaches. However, it adjusts its choices based on the students' feedback.

The decision agent, which is the contextual bandit we would train on this data, needs to learn from this dataset a policy that minimizes cost (negative feedback) and presents content that suits each student's preferences, based on context. The context is the student's preferred learning style. We follow the VARK (Visual, Aural, Read/write, Kinesthetic) model, represented here by the features video, audio, reading and kinesthetic.

Context helps choose the best actions. We have limited contextual information (features) at the moment. Once the system is stabilized, we would add more contextual data. The best actions are those that maximize reward. Since we are creating the dataset, we know which actions those are, but the bandit algorithm knows nothing about them.

# Goal

Generate data to train an online adaptive learning system which chooses the best content based on context to maximize rewards. The system would learn based on student feedback to present the best content. 

# Challenges 

Producing a dataset that is not biased and that is worth learning from.

# Dataset Assumptions 

* Students are modelled by their contextual information.
* Rewards are assumed to be discrete, in {0,1}. Continuous rewards are left as future work.
* For each topic, there exists content that caters to different, diverse students.
     


# Story

We present a questionnaire to students to understand how they prefer to learn new information, so that we can help them understand topics. Below is the question.

1. Which are the effective ways to explain concepts to you?
    - Video / Audio / Reading / Kinesthetic (VARK).

# Dataset

The dataset comprises contextual information about the student, with each preference drawn from a Bernoulli distribution (*currently given by features about their preference for Video / Audio / Reading / Kinesthetics*). It also has contextual information about the content. Presently, the only contextual information we have about content is whether it is available for the topic. Each action (content) has a prior probability, which represents the prior belief of a teacher about the content. This is set randomly. For each content/action presented to the student, we receive feedback. This feedback is represented as **cost**. The learner's objective is to minimize cost.

Each content is selected based on a value sampled from its Beta distribution. Every content has a Beta distribution initialized to (1,1), and its parameters are updated upon feedback from students. After positive feedback the sampled value tends to be high; after negative feedback it tends to be low. Student feedback is a Bernoulli random variable taking values in {0,1}, whose probability of success is sampled from the content's Beta distribution.
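The Beta bookkeeping described above can be sketched in a few lines. This is only an illustration, not the notebook's own code: the class name `ContentArm` and its methods are hypothetical, it uses the standard library's `random.betavariate` instead of NumPy, and it assumes positive feedback increments the first Beta parameter and negative feedback the second.

```python
import random


class ContentArm:
    """Hypothetical helper: one content item with a Beta(rewards, rejections) belief."""

    def __init__(self):
        self.rewards = 1      # first Beta parameter, initialized to 1
        self.rejections = 1   # second Beta parameter, initialized to 1

    def sample(self, rng):
        # Draw a success probability from the current Beta belief.
        return rng.betavariate(self.rewards, self.rejections)

    def update(self, feedback):
        # feedback is 1 (content was useful) or 0 (content was not useful).
        if feedback == 1:
            self.rewards += 1
        else:
            self.rejections += 1


rng = random.Random(0)  # fixed seed, chosen only to make the sketch repeatable
arm = ContentArm()
for feedback in [1, 1, 0, 1]:
    arm.update(feedback)

draw = arm.sample(rng)
print(arm.rewards, arm.rejections, draw)
```

After three positive and one negative feedback the belief is Beta(4, 2), so samples concentrate above 0.5 without being pinned there.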

Each data point is represented as **Contents Topic_Tag|Namespace User_Context |Namespace Content_Context**

Here *Contents* are all the available actions/arms for a topic. Only the selected content has a cost (feedback) and prior probability associated with it. *Tag* identifies a data point. The *namespace* is a placeholder for contextual information about the student and the content. It takes the values *student* and *content*. Contextual information has the format *name:value* or *name*. If you specify only the *name*, it defaults to *name:1*. Each student preference is represented as 0 or 1. For example, *video:1* implies the student prefers video (*because of 1*) and *reading:0* implies the student does not prefer reading (*because of 0*). Consider the data point below.

4:1:0.3 5 T_1|student video:1 audio:1 reading:0 kinesthetic:1 |content C_1_1 C_1_2 

Below is a break down of the above data point:

- 4:1:0.3 5 T_1 : 
    - 4:1:0.3 -->  Content 4 was selected & returned cost of 1 (*negative feedback*). The teacher set a probability of 0.3 for this content to be selected 
    - 5 -->  The other available content for this topic. 
    - T_1 -->  The content given above was for topic T_1. 
    
- |student video:1 audio:1 reading:0 kinesthetic:1 |content C_1_1 C_1_2 
    - student -->  The namespace that holds the student's contextual information.
    - video:1 audio:1 reading:0 kinesthetic:1 --> The student's contextual information.
    - content --> The namespace that holds the content's contextual information.
    - C_1_1 C_1_2 --> The content's contextual information. C_1_1 represents content 4 and C_1_2 represents content 5; that is, 4 and 5 are the encodings of C_1_1 and C_1_2. This namespace is optional.

Note: | is the separator between the arms-and-topic section and each namespace.
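To make the breakdown above concrete, here is a small parser for one data point. It is a sketch under the format described in this section only; the function name `parse_data_point` and the dictionary layout it returns are illustrative choices, not part of the notebook or of Vowpal Wabbit.

```python
def parse_data_point(line):
    """Split one data point into its arms, topic tag, and namespaces."""
    # Everything before the first '|' holds the arms followed by the topic tag.
    head, *namespace_parts = line.split("|")
    *arm_tokens, topic = head.split()

    arms = []
    for token in arm_tokens:
        fields = token.split(":")
        arm = {"action": int(fields[0]), "cost": None, "prob": None}
        if len(fields) == 3:  # only the selected arm carries cost and probability
            arm["cost"] = float(fields[1])
            arm["prob"] = float(fields[2])
        arms.append(arm)

    namespaces = {}
    for part in namespace_parts:
        name, *features = part.split()
        parsed = {}
        for feature in features:
            key, _, value = feature.partition(":")
            parsed[key] = float(value) if value else 1.0  # a bare name means name:1
        namespaces[name] = parsed

    return {"topic": topic, "arms": arms, "namespaces": namespaces}


example = "4:1:0.3 5 T_1|student video:1 audio:1 reading:0 kinesthetic:1 |content C_1_1 C_1_2 "
point = parse_data_point(example)
print(point["topic"])
print(point["arms"])
print(point["namespaces"])
```

The parser treats a token with three colon-separated fields as the selected arm, which matches the rule that only the selected content has a cost and a prior probability.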


# Algorithm used for data set generation

The algorithm can be seen as having 2 parts:

1. Setting up data : 

    - Configure variables: number of students, topic, number of contents per topic, contextual variables for student & content.
    - Create student context by drawing each preference from a Binomial distribution with a single trial (a Bernoulli draw).
    - Create topics & content for each topic. 
    - Create content: Initialize variables for Beta distribution to determine chances of reward from content.

2. Create an omniscient policy/oracle

    - For each student 's'
      - For each topic 't'
        - Get contents 'c' available in topic 't'
        - While content remains in 'c' for 't'
          - Select the content with the highest chance of reward.
          - Receive feedback from the student.
          - Update the parameters of the Beta distribution for the selected content/arm based on the feedback.
          - If the feedback is positive:
            - Move to the next topic.
          - Else:
            - Remove the content.
            - Stay on the same topic.

# Dataset Generation

## Imports

In [231]:
import numpy as np , pandas as pd 
import os , time , copy # Copy : To deep copy python dict. As python dict copy is by reference & not by value
from scipy.stats import bernoulli
from sklearn.preprocessing import LabelEncoder

## Initialize variables

In [244]:
# These are the variables to change your sample size

number_of_students = 50 # Students taking the course. 
user_context = ['video','audio','reading','kinesthetic'] # Student preferences
number_of_topics = 10 # Number of topics in the course
content_columns = ["content_id" , "encoded" , "prior_prob" , "rewards" , "rejections"]
# no_contents_per_topic : This can be a constant or variable. Comment the one you don't want to use
# no_contents_per_topic = [4] * number_of_topics # Same number of contents per topic
no_contents_per_topic = np.random.randint(1,5,number_of_topics) # Variable number of contents per topic.


## Create Student Context Data

In [233]:
context_df = pd.DataFrame(data=np.random.binomial(1 , [0.7,0.6,0.5,0.4] , size=(number_of_students,len(user_context))) , columns = user_context)
context_df.head()

|   | video | audio | reading | kinesthetic |
|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 0 |
| 1 | 1 | 1 | 1 | 0 |
| 2 | 1 | 1 | 1 | 0 |
| 3 | 1 | 1 | 1 | 0 |
| 4 | 1 | 1 | 1 | 1 |
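The cell above relies on NumPy broadcasting the four probabilities across the columns. As a rough illustration of what that amounts to, the sketch below draws the same kind of data with the standard library only. The helper name `sample_student`, the seed, and the sample size of 10000 are assumptions made for the example.

```python
import random

user_context = ["video", "audio", "reading", "kinesthetic"]
preference_probs = [0.7, 0.6, 0.5, 0.4]  # same probabilities as the cell above


def sample_student(rng):
    """Draw one student: each preference is an independent Bernoulli trial."""
    return {name: int(rng.random() < p) for name, p in zip(user_context, preference_probs)}


rng = random.Random(0)  # fixed seed, chosen only to make the sketch repeatable
students = [sample_student(rng) for _ in range(10000)]

# The share of 1s per feature should land close to the probability it was drawn with.
rates = {name: sum(s[name] for s in students) / len(students) for name in user_context}
print(rates)
```

With only 50 students, as configured in this notebook, the observed rates can differ noticeably from the configured probabilities.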


In [245]:
# Transform student data in sparse data format required for learning. 
features = [] # Student Context
for index, student_pref in context_df.iterrows():
    context_str = ''
    for i in range(len(user_context)):
        context_str += user_context[i] + ":" + str(student_pref[user_context[i]]) + ' '
    features.append(context_str)
print(features)

['video:1 audio:1 reading:1 kinesthetic:0 ', 'video:1 audio:1 reading:1 kinesthetic:0 ', 'video:1 audio:1 reading:1 kinesthetic:0 ', 'video:1 audio:1 reading:1 kinesthetic:0 ', 'video:1 audio:1 reading:1 kinesthetic:1 ', 'video:1 audio:1 reading:1 kinesthetic:0 ', 'video:0 audio:1 reading:1 kinesthetic:1 ', 'video:1 audio:0 reading:1 kinesthetic:0 ', 'video:0 audio:0 reading:0 kinesthetic:0 ', 'video:1 audio:0 reading:0 kinesthetic:0 ', 'video:0 audio:0 reading:0 kinesthetic:1 ', 'video:0 audio:1 reading:1 kinesthetic:1 ', 'video:1 audio:0 reading:1 kinesthetic:1 ', 'video:0 audio:1 reading:0 kinesthetic:0 ', 'video:0 audio:0 reading:1 kinesthetic:0 ', 'video:1 audio:1 reading:0 kinesthetic:0 ', 'video:1 audio:1 reading:0 kinesthetic:0 ', 'video:1 audio:1 reading:1 kinesthetic:0 ', 'video:0 audio:0 reading:1 kinesthetic:0 ', 'video:1 audio:0 reading:1 kinesthetic:0 ', 'video:1 audio:0 reading:0 kinesthetic:0 ', 'video:1 audio:0 reading:0 kinesthetic:1 ', 'video:1 audio:1 reading:0 kine
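The string building in the cell above can also be written as a small function, which makes the target format easier to check in isolation. This is a sketch; the function name `to_feature_string` is hypothetical, and it takes a plain dictionary rather than a DataFrame row.

```python
def to_feature_string(preferences, names):
    """Format preferences as space-terminated 'name:value' pairs."""
    return "".join(f"{name}:{preferences[name]} " for name in names)


user_context = ["video", "audio", "reading", "kinesthetic"]
student = {"video": 1, "audio": 1, "reading": 0, "kinesthetic": 1}
feature_string = to_feature_string(student, user_context)
print(repr(feature_string))
```

The trailing space is kept on purpose: the oracle cell concatenates this string directly with `|content`, so the space separates the last feature from the next namespace.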

## Map content to topic

In [246]:
# Prepare topic to content mapping. 
topic_content = {} # Maps topics to content. 
all_topics = [] # Saves all topics for this course. 
all_contents = [] # Saves all content we have for the course
for i,j in enumerate(no_contents_per_topic):
    topic_id = "T_" + str(i+1) # e.g : T_10
    all_topics.append(topic_id)
    content_ids = [] # Temporary variable to help map topic to content. 
    for j_1 in range(1,j+1) : # Number of contents
        c_id = 'C_' + str(i+1) + '_' + str(j_1) # e.g : C_10_2 : Content number 2 for topic 10
        content_ids.append(c_id)
        all_contents.append(c_id)
    topic_content[topic_id] = content_ids
le = LabelEncoder().fit(all_contents)    
print('All topics : ', all_topics)
print('\n All Contents : ' , all_contents)
print('\n Contents per topic : ', topic_content)

All topics :  ['T_1', 'T_2', 'T_3', 'T_4', 'T_5', 'T_6', 'T_7', 'T_8', 'T_9', 'T_10']

 All Contents :  ['C_1_1', 'C_1_2', 'C_1_3', 'C_1_4', 'C_2_1', 'C_2_2', 'C_2_3', 'C_3_1', 'C_3_2', 'C_4_1', 'C_4_2', 'C_4_3', 'C_4_4', 'C_5_1', 'C_5_2', 'C_5_3', 'C_5_4', 'C_6_1', 'C_6_2', 'C_7_1', 'C_7_2', 'C_7_3', 'C_8_1', 'C_8_2', 'C_9_1', 'C_10_1', 'C_10_2', 'C_10_3', 'C_10_4']

 Contents per topic :  {'T_9': ['C_9_1'], 'T_1': ['C_1_1', 'C_1_2', 'C_1_3', 'C_1_4'], 'T_7': ['C_7_1', 'C_7_2', 'C_7_3'], 'T_3': ['C_3_1', 'C_3_2'], 'T_6': ['C_6_1', 'C_6_2'], 'T_10': ['C_10_1', 'C_10_2', 'C_10_3', 'C_10_4'], 'T_4': ['C_4_1', 'C_4_2', 'C_4_3', 'C_4_4'], 'T_5': ['C_5_1', 'C_5_2', 'C_5_3', 'C_5_4'], 'T_2': ['C_2_1', 'C_2_2', 'C_2_3'], 'T_8': ['C_8_1', 'C_8_2']}


## Create content

In [247]:
# Create the content dataframe: content_id, encoded form, prior probability, number of passes (rewards), number of rejections (no reward).

rows = [] # DataFrame.append was removed in pandas 2.0, so collect the rows in a list first.
for t in topic_content.keys():
    c = topic_content[t]
    content_prob_per_topic = np.random.random(len(c)) # Teachers might prefer some content over others. These probabilities capture that.
    content_prob_per_topic_normalized = np.round(content_prob_per_topic / sum(content_prob_per_topic) , 2)
    for i in range(len(c)):
        temp_content_item = {}
        temp_content_item["content_id"] = c[i]
        encoded = int(le.transform([c[i]])[0])
        if encoded == 0: # VW doesn't like its actions to be encoded as 0, so label 0 is remapped to the unused id len(all_contents).
            encoded = len(all_contents)
        temp_content_item["encoded"] = encoded
        temp_content_item["prior_prob"] = content_prob_per_topic_normalized[i]
        temp_content_item["rewards"] = 1 # Parameter 'a' of Beta distribution.
        temp_content_item["rejections"] = 1 # Parameter 'b' of Beta distribution.
        temp_content_item["beta_dist_sample"] = np.random.beta(1,1)
        rows.append(temp_content_item)
# dtype=object keeps integer values as integers when a whole row is looked up; the Beta sample is cast to float so idxmax works on it.
content_df = pd.DataFrame(rows, columns=content_columns + ["beta_dist_sample"], dtype=object)
content_df["beta_dist_sample"] = content_df["beta_dist_sample"].astype(float)
content_df.set_index("content_id" , inplace=True, verify_integrity=True)
print("Number of contents : " , len(content_df))
content_df

Number of contents :  29


| content_id | encoded | prior_prob | rewards | rejections | beta_dist_sample |
|---|---|---|---|---|---|
| C_9_1 | 28 | 1.0 | 1 | 1 | 0.955019 |
| C_1_1 | 4 | 0.02 | 1 | 1 | 0.542714 |
| C_1_2 | 5 | 0.46 | 1 | 1 | 0.079422 |
| C_1_3 | 6 | 0.29 | 1 | 1 | 0.554696 |
| C_1_4 | 7 | 0.23 | 1 | 1 | 0.707773 |
| C_7_1 | 23 | 0.34 | 1 | 1 | 0.962667 |
| C_7_2 | 24 | 0.39 | 1 | 1 | 0.090885 |
| C_7_3 | 25 | 0.26 | 1 | 1 | 0.854891 |
| C_3_1 | 11 | 0.37 | 1 | 1 | 0.660449 |
| C_3_2 | 12 | 0.63 | 1 | 1 | 0.665315 |
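The values in the `encoded` column can look surprising, for example C_1_1 is 4 rather than 1. The sketch below reproduces them without scikit-learn, on the assumption that labels are assigned in sorted string order (which is how `LabelEncoder` orders its classes) and that label 0 is then remapped to the number of contents, as in the cell above. The content list is copied from the "All Contents" output earlier in the notebook.

```python
all_contents = [
    "C_1_1", "C_1_2", "C_1_3", "C_1_4", "C_2_1", "C_2_2", "C_2_3",
    "C_3_1", "C_3_2", "C_4_1", "C_4_2", "C_4_3", "C_4_4", "C_5_1",
    "C_5_2", "C_5_3", "C_5_4", "C_6_1", "C_6_2", "C_7_1", "C_7_2",
    "C_7_3", "C_8_1", "C_8_2", "C_9_1", "C_10_1", "C_10_2", "C_10_3", "C_10_4",
]

# Assumption: labels follow sorted string order. 'C_10_1' sorts before 'C_1_1'
# because the character '0' sorts before '_'.
labels = {content_id: index for index, content_id in enumerate(sorted(all_contents))}

# Remap label 0 to len(all_contents) so that no action is encoded as 0.
encoded = {
    content_id: (len(all_contents) if label == 0 else label)
    for content_id, label in labels.items()
}

print(encoded["C_10_1"], encoded["C_10_2"], encoded["C_1_1"], encoded["C_9_1"])
```

Because the order is lexicographic, the four T_10 contents take labels 0 to 3, which shifts every other content id up by four.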


# Oracle

In [249]:
# Oracle.
dataset = []
for student in features:
    topic_content_copy = copy.deepcopy(topic_content) # Every student starts with the full content list for each topic.
    for t in topic_content_copy:
        content_ids = topic_content_copy[t]
        while content_ids: # Stay on this topic until content is accepted or none is left.
            content_df_beta = content_df.loc[content_ids, "beta_dist_sample"]
            selected_content = content_df_beta.idxmax() # Arm/Content selection by the oracle is based on the Beta sample.
            feedback = bernoulli.rvs(size=1, p=content_df_beta[selected_content])[0] # Feedback from student is Bernoulli: 1 is positive, 0 is negative.
            cost = 0 if feedback == 1 else 1 # Positive feedback has cost 0, negative feedback has cost 1.
            # Prepare the data point for Vowpal Wabbit. Only the selected arm carries cost and prior probability.
            arms = ''
            for c in content_ids:
                arms += str(content_df.loc[c, 'encoded'])
                if c == selected_content:
                    arms += ":" + str(cost) + ":" + str(content_df.loc[c, 'prior_prob'])
                arms += ' '
            arms_context = ''.join(c + ' ' for c in content_ids)
            line = arms + t + "|student " + student + "|content " + arms_context
            print(line)
            dataset.append(line)
            # Update the Beta parameters of the selected content and draw a fresh sample.
            counter = "rewards" if feedback == 1 else "rejections"
            content_df.loc[selected_content, counter] += 1
            content_df.loc[selected_content, "beta_dist_sample"] = np.random.beta(content_df.loc[selected_content, "rewards"], content_df.loc[selected_content, "rejections"])
            if feedback == 1:
                break # Positive feedback: move to the next topic.
            content_ids.remove(selected_content) # Negative feedback: drop this content and stay on the same topic.
# Write to file
timestr = time.strftime("%Y%m%d-%H%M%S")
file_name = "data_" + timestr + ".dat"
file_path = os.path.join(".." , "dataset" , file_name)
os.makedirs(os.path.dirname(file_path), exist_ok=True) # Create the output directory if it does not exist yet.
with open(file_path, 'w') as f:
    for d in dataset:
        f.write(d + "\n")

4 5 6 7:0:0.23 T_1|student video:1 audio:1 reading:1 kinesthetic:0 |content C_1_1 C_1_2 C_1_3 C_1_4 
23 24 25:0:0.26 T_7|student video:1 audio:1 reading:1 kinesthetic:0 |content C_7_1 C_7_2 C_7_3 
28:0:1.0 T_9|student video:1 audio:1 reading:1 kinesthetic:0 |content C_9_1 
21:0:0.14 22 T_6|student video:1 audio:1 reading:1 kinesthetic:0 |content C_6_1 C_6_2 
29:0:0.42 1 2 3 T_10|student video:1 audio:1 reading:1 kinesthetic:0 |content C_10_1 C_10_2 C_10_3 C_10_4 
13 14 15:1:0.12 16 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 C_4_3 C_4_4 
13 14 16:1:0.12 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 C_4_4 
13 14:1:0.29 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 
13:0:0.47 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 
11:1:0.37 12 T_3|student video:1 audio:1 reading:1 kinesthetic:0 |content C_3_1 C_3_2 
12:1:0.63 T_3|student video:1 audio:1 reading:1 kinesthetic:0 |content C_3_2 
1...

13 14 15:1:0.12 T_4|student video:1 audio:0 reading:0 kinesthetic:0 |content C_4_1 C_4_2 C_4_3 
13 14:1:0.29 T_4|student video:1 audio:0 reading:0 kinesthetic:0 |content C_4_1 C_4_2 
13:1:0.47 T_4|student video:1 audio:0 reading:0 kinesthetic:0 |content C_4_1 
11:0:0.37 12 T_3|student video:1 audio:0 reading:0 kinesthetic:0 |content C_3_1 C_3_2 
17:0:0.22 18 19 20 T_5|student video:1 audio:0 reading:0 kinesthetic:0 |content C_5_1 C_5_2 C_5_3 C_5_4 
8:1:0.2 9 10 T_2|student video:1 audio:0 reading:0 kinesthetic:0 |content C_2_1 C_2_2 C_2_3 
9 10:1:0.08 T_2|student video:1 audio:0 reading:0 kinesthetic:0 |content C_2_2 C_2_3 
9:1:0.72 T_2|student video:1 audio:0 reading:0 kinesthetic:0 |content C_2_2 
26:1:0.93 27 T_8|student video:1 audio:0 reading:0 kinesthetic:0 |content C_8_1 C_8_2 
27:0:0.07 T_8|student video:1 audio:0 reading:0 kinesthetic:0 |content C_8_2 
4 5 6 7:1:0.23 T_1|student video:0 audio:0 reading:0 kinesthetic:1 |content C_1_1 C_1_2 C_1_3 C_1_4 
4:1:0.02 5 6 T_1|student ...

21 22:1:0.86 T_6|student video:1 audio:1 reading:0 kinesthetic:0 |content C_6_1 C_6_2 
21:0:0.14 T_6|student video:1 audio:1 reading:0 kinesthetic:0 |content C_6_1 
29:0:0.42 1 2 3 T_10|student video:1 audio:1 reading:0 kinesthetic:0 |content C_10_1 C_10_2 C_10_3 C_10_4 
13 14 15 16:0:0.12 T_4|student video:1 audio:1 reading:0 kinesthetic:0 |content C_4_1 C_4_2 C_4_3 C_4_4 
11:1:0.37 12 T_3|student video:1 audio:1 reading:0 kinesthetic:0 |content C_3_1 C_3_2 
12:1:0.63 T_3|student video:1 audio:1 reading:0 kinesthetic:0 |content C_3_2 
17:1:0.22 18 19 20 T_5|student video:1 audio:1 reading:0 kinesthetic:0 |content C_5_1 C_5_2 C_5_3 C_5_4 
18 19 20:1:0.31 T_5|student video:1 audio:1 reading:0 kinesthetic:0 |content C_5_2 C_5_3 C_5_4 
18 19:0:0.08 T_5|student video:1 audio:1 reading:0 kinesthetic:0 |content C_5_2 C_5_3 
8:1:0.2 9 10 T_2|student video:1 audio:1 reading:0 kinesthetic:0 |content C_2_1 C_2_2 C_2_3 
9:1:0.72 10 T_2|student video:1 audio:1 reading:0 kinesthetic:0 |content C_2_...

9:1:0.72 10 T_2|student video:0 audio:0 reading:0 kinesthetic:0 |content C_2_2 C_2_3 
10:1:0.08 T_2|student video:0 audio:0 reading:0 kinesthetic:0 |content C_2_3 
26:0:0.93 27 T_8|student video:0 audio:0 reading:0 kinesthetic:0 |content C_8_1 C_8_2 
4:1:0.02 5 6 7 T_1|student video:1 audio:0 reading:0 kinesthetic:0 |content C_1_1 C_1_2 C_1_3 C_1_4 
5 6 7:1:0.23 T_1|student video:1 audio:0 reading:0 kinesthetic:0 |content C_1_2 C_1_3 C_1_4 
5 6:1:0.29 T_1|student video:1 audio:0 reading:0 kinesthetic:0 |content C_1_2 C_1_3 
5:1:0.46 T_1|student video:1 audio:0 reading:0 kinesthetic:0 |content C_1_2 
23:0:0.34 24 25 T_7|student video:1 audio:0 reading:0 kinesthetic:0 |content C_7_1 C_7_2 C_7_3 
28:0:1.0 T_9|student video:1 audio:0 reading:0 kinesthetic:0 |content C_9_1 
21 22:1:0.86 T_6|student video:1 audio:0 reading:0 kinesthetic:0 |content C_6_1 C_6_2 
21:1:0.14 T_6|student video:1 audio:0 reading:0 kinesthetic:0 |content C_6_1 
29:0:0.42 1 2 3 T_10|student video:1 audio:0 reading:0 ...

9:1:0.72 10 T_2|student video:1 audio:0 reading:0 kinesthetic:0 |content C_2_2 C_2_3 
10:1:0.08 T_2|student video:1 audio:0 reading:0 kinesthetic:0 |content C_2_3 
26:0:0.93 27 T_8|student video:1 audio:0 reading:0 kinesthetic:0 |content C_8_1 C_8_2 
4:0:0.02 5 6 7 T_1|student video:1 audio:1 reading:1 kinesthetic:1 |content C_1_1 C_1_2 C_1_3 C_1_4 
23:0:0.34 24 25 T_7|student video:1 audio:1 reading:1 kinesthetic:1 |content C_7_1 C_7_2 C_7_3 
28:0:1.0 T_9|student video:1 audio:1 reading:1 kinesthetic:1 |content C_9_1 
21 22:0:0.86 T_6|student video:1 audio:1 reading:1 kinesthetic:1 |content C_6_1 C_6_2 
29:0:0.42 1 2 3 T_10|student video:1 audio:1 reading:1 kinesthetic:1 |content C_10_1 C_10_2 C_10_3 C_10_4 
13 14 15:0:0.12 16 T_4|student video:1 audio:1 reading:1 kinesthetic:1 |content C_4_1 C_4_2 C_4_3 C_4_4 
11:1:0.37 12 T_3|student video:1 audio:1 reading:1 kinesthetic:1 |content C_3_1 C_3_2 
12:1:0.63 T_3|student video:1 audio:1 reading:1 kinesthetic:1 |content C_3_2 
17:0:0.22 1...

26:0:0.93 27 T_8|student video:0 audio:0 reading:1 kinesthetic:0 |content C_8_1 C_8_2 
4:0:0.02 5 6 7 T_1|student video:0 audio:1 reading:0 kinesthetic:0 |content C_1_1 C_1_2 C_1_3 C_1_4 
23:0:0.34 24 25 T_7|student video:0 audio:1 reading:0 kinesthetic:0 |content C_7_1 C_7_2 C_7_3 
28:0:1.0 T_9|student video:0 audio:1 reading:0 kinesthetic:0 |content C_9_1 
21 22:1:0.86 T_6|student video:0 audio:1 reading:0 kinesthetic:0 |content C_6_1 C_6_2 
21:1:0.14 T_6|student video:0 audio:1 reading:0 kinesthetic:0 |content C_6_1 
29:0:0.42 1 2 3 T_10|student video:0 audio:1 reading:0 kinesthetic:0 |content C_10_1 C_10_2 C_10_3 C_10_4 
13 14 15:0:0.12 16 T_4|student video:0 audio:1 reading:0 kinesthetic:0 |content C_4_1 C_4_2 C_4_3 C_4_4 
11:1:0.37 12 T_3|student video:0 audio:1 reading:0 kinesthetic:0 |content C_3_1 C_3_2 
12:1:0.63 T_3|student video:0 audio:1 reading:0 kinesthetic:0 |content C_3_2 
17:0:0.22 18 19 20 T_5|student video:0 audio:1 reading:0 kinesthetic:0 |content C_5_1 C_5_2 C_5_3...

13 14 15:1:0.12 16 T_4|student video:1 audio:1 reading:1 kinesthetic:1 |content C_4_1 C_4_2 C_4_3 C_4_4 
13 14 16:1:0.12 T_4|student video:1 audio:1 reading:1 kinesthetic:1 |content C_4_1 C_4_2 C_4_4 
13 14:1:0.29 T_4|student video:1 audio:1 reading:1 kinesthetic:1 |content C_4_1 C_4_2 
13:0:0.47 T_4|student video:1 audio:1 reading:1 kinesthetic:1 |content C_4_1 
11:1:0.37 12 T_3|student video:1 audio:1 reading:1 kinesthetic:1 |content C_3_1 C_3_2 
12:1:0.63 T_3|student video:1 audio:1 reading:1 kinesthetic:1 |content C_3_2 
17:0:0.22 18 19 20 T_5|student video:1 audio:1 reading:1 kinesthetic:1 |content C_5_1 C_5_2 C_5_3 C_5_4 
8:1:0.2 9 10 T_2|student video:1 audio:1 reading:1 kinesthetic:1 |content C_2_1 C_2_2 C_2_3 
9 10:1:0.08 T_2|student video:1 audio:1 reading:1 kinesthetic:1 |content C_2_2 C_2_3 
9:1:0.72 T_2|student video:1 audio:1 reading:1 kinesthetic:1 |content C_2_2 
26:1:0.93 27 T_8|student video:1 audio:1 reading:1 kinesthetic:1 |content C_8_1 C_8_2 
27:0:0.07 T_8|student...
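One quick sanity check on generated lines is to tally the recorded cost per topic. The sketch below does this for five complete lines copied from the output above. The variable names are illustrative, and it assumes the selected arm is the only token with two colons, as described in the Dataset section.

```python
from collections import defaultdict

lines = [
    "4 5 6 7:0:0.23 T_1|student video:1 audio:1 reading:1 kinesthetic:0 |content C_1_1 C_1_2 C_1_3 C_1_4 ",
    "13 14 15:1:0.12 16 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 C_4_3 C_4_4 ",
    "13 14 16:1:0.12 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 C_4_4 ",
    "13 14:1:0.29 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 C_4_2 ",
    "13:0:0.47 T_4|student video:1 audio:1 reading:1 kinesthetic:0 |content C_4_1 ",
]

costs_by_topic = defaultdict(list)
for line in lines:
    head = line.split("|", 1)[0]          # arms followed by the topic tag
    *arm_tokens, topic = head.split()
    for token in arm_tokens:
        if token.count(":") == 2:         # the selected arm: action:cost:probability
            costs_by_topic[topic].append(int(token.split(":")[1]))

for topic, costs in costs_by_topic.items():
    print(topic, "attempts:", len(costs), "mean cost:", sum(costs) / len(costs))
```

In this sample the student accepted the first content for T_1 but needed four attempts on T_4, so T_4 has a higher mean cost.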

In [250]:
content_df

| content_id | encoded | prior_prob | rewards | rejections | beta_dist_sample |
|---|---|---|---|---|---|
| C_9_1 | 28 | 1.0 | 101 | 1 | 0.999568 |
| C_1_1 | 4 | 0.02 | 57 | 13 | 0.877417 |
| C_1_2 | 5 | 0.46 | 1 | 5 | 0.361015 |
| C_1_3 | 6 | 0.29 | 8 | 7 | 0.58587 |
| C_1_4 | 7 | 0.23 | 34 | 15 | 0.670989 |
| C_7_1 | 23 | 0.34 | 69 | 4 | 0.987093 |
| C_7_2 | 24 | 0.39 | 1 | 2 | 0.290001 |
| C_7_3 | 25 | 0.26 | 32 | 4 | 0.883685 |
| C_3_1 | 11 | 0.37 | 14 | 88 | 0.16873 |
| C_3_2 | 12 | 0.63 | 2 | 89 | 0.046425 |
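The `beta_dist_sample` column is a single random draw, so it is noisy. A steadier summary of the same counts is the mean of a Beta(a, b) distribution, a / (a + b). The sketch below computes it for a few rows copied from the table above. The helper name `posterior_mean` is illustrative and this summary is not something the notebook itself computes.

```python
# (rewards, rejections) pairs copied from the final content table above.
final_counts = {
    "C_9_1": (101, 1),
    "C_1_1": (57, 13),
    "C_1_2": (1, 5),
    "C_3_1": (14, 88),
    "C_3_2": (2, 89),
}


def posterior_mean(rewards, rejections):
    """Mean of a Beta(rewards, rejections) distribution."""
    return rewards / (rewards + rejections)


means = {cid: posterior_mean(a, b) for cid, (a, b) in final_counts.items()}
ranked = sorted(means, key=means.get, reverse=True)

for cid in ranked:
    print(cid, round(means[cid], 3))
```

Both T_3 contents end with far more rejections than rewards, which is consistent with the many cost-1 lines for T_3 in the oracle output.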
