In [1]:
import pandas as pd

In [2]:
assessments = pd.read_csv("assessments.csv")
courses = pd.read_csv("courses.csv")
studentAssessment = pd.read_csv("studentAssessment.csv")
studentInfo = pd.read_csv("studentInfo.csv")
studentRegistration = pd.read_csv("studentRegistration.csv")
studentVle = pd.read_csv("studentVle.csv")
vle = pd.read_csv("vle.csv")

# assessments

- code_module : identification code of the module, to which the assessment belongs.
- code_presentation : identification code of the presentation, to which the assessment belongs.
- id_assessment : identification number of the assessment.
- assessment_type : type of assessment. Three types of assessments exist: Tutor Marked Assessment (TMA), Computer Marked Assessment (CMA) and Final Exam (Exam).
- date : the final submission date of the assessment, calculated as the number of days since the start of the module-presentation. The starting date of the presentation has number 0 (zero).
- weight : weight of the assessment in %. Typically, Exams are treated separately and have the weight 100%; the weights of all other assessments sum to 100%.
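
The weight rule above can be checked per module-presentation. The following is a minimal sketch, not part of the original notebook: the helper name `non_exam_weight_totals` and the small sample frame are illustrative assumptions, with values copied from the AAA/2013J rows displayed in this section. Because the rule only holds "typically" (the GGG rows in the display carry a weight of 0.0), a total other than 100 is not necessarily an error.

```python
import pandas as pd

# Sample frame (assumption: built by hand from the five AAA/2013J rows
# displayed in this section, so the sketch runs without the CSV file).
assessments_sample = pd.DataFrame({
    "code_module": ["AAA"] * 5,
    "code_presentation": ["2013J"] * 5,
    "id_assessment": [1752, 1753, 1754, 1755, 1756],
    "assessment_type": ["TMA"] * 5,
    "date": [19.0, 54.0, 117.0, 166.0, 215.0],
    "weight": [10.0, 20.0, 20.0, 20.0, 30.0],
})


def non_exam_weight_totals(df):
    """Sum the weights of all non-Exam assessments per module-presentation."""
    non_exam = df[df["assessment_type"] != "Exam"]
    return (
        non_exam.groupby(["code_module", "code_presentation"])["weight"]
        .sum()
        .rename("non_exam_weight")
        .reset_index()
    )


totals = non_exam_weight_totals(assessments_sample)
print(totals)
```

Running the same helper on the full `assessments` frame would list every module-presentation whose non-exam weights deviate from 100.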

![image.png](attachment:4a9f51c5-dc1b-494a-979a-62bb28e769c1.png)

In [3]:
assessments

Unnamed: 0,code_module,code_presentation,id_assessment,assessment_type,date,weight
0,AAA,2013J,1752,TMA,19.0,10.0
1,AAA,2013J,1753,TMA,54.0,20.0
2,AAA,2013J,1754,TMA,117.0,20.0
3,AAA,2013J,1755,TMA,166.0,20.0
4,AAA,2013J,1756,TMA,215.0,30.0
...,...,...,...,...,...,...
201,GGG,2014J,37443,CMA,229.0,0.0
202,GGG,2014J,37435,TMA,61.0,0.0
203,GGG,2014J,37436,TMA,124.0,0.0
204,GGG,2014J,37437,TMA,173.0,0.0


# courses

- code_module : code name of the module, which serves as the identifier.
- code_presentation : code name of the presentation. It consists of the year followed by “B” for a presentation starting in February or “J” for a presentation starting in October.
- module_presentation_length : length of the module-presentation in days.
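
The structure of `code_presentation` described above (year plus a letter) can be unpacked into separate columns. This is a small sketch under stated assumptions: the helper `split_presentation`, the `START_MONTH` mapping, and the new column names `year` and `start_month` are illustrative and do not appear in the original notebook. The sample rows are taken from the `courses` table shown in this section.

```python
import pandas as pd

# Mapping taken from the column description: B = February, J = October.
START_MONTH = {"B": "February", "J": "October"}


def split_presentation(code):
    """Split a code such as '2013J' into (year, start_month)."""
    year, letter = int(code[:-1]), code[-1]
    return year, START_MONTH[letter]


# Sample frame (assumption: three rows copied by hand from the displayed table).
courses_sample = pd.DataFrame({
    "code_module": ["AAA", "BBB", "BBB"],
    "code_presentation": ["2013J", "2013B", "2014B"],
    "module_presentation_length": [268, 240, 234],
})

years, months = zip(*courses_sample["code_presentation"].map(split_presentation))
courses_sample["year"] = list(years)
courses_sample["start_month"] = list(months)
print(courses_sample)
```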

![image.png](attachment:1dac1ea1-c4ae-4b8f-8dd8-63bc7d504802.png)

In [4]:
courses

Unnamed: 0,code_module,code_presentation,module_presentation_length
0,AAA,2013J,268
1,AAA,2014J,269
2,BBB,2013J,268
3,BBB,2014J,262
4,BBB,2013B,240
5,BBB,2014B,234
6,CCC,2014J,269
7,CCC,2014B,241
8,DDD,2013J,261
9,DDD,2014J,262


# studentAssessment

- id_assessment : the identification number of the assessment.
- id_student : a unique identification number for the student.
- date_submitted : the date of student submission, measured as the number of days since the start of the module presentation.
- is_banked : a status flag indicating that the assessment result has been transferred from a previous presentation.
- score : the student’s score in this assessment, in the range from 0 to 100. A score lower than 40 is interpreted as Fail.
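
Two derived columns follow naturally from these definitions: whether a submission reached the pass mark of 40, and how many days it arrived relative to the deadline stored in the `assessments` table. The sketch below is illustrative only. The column names `passed` and `days_late`, the constant `PASS_MARK`, and the hand-built sample frames are assumptions; reading `date_submitted - date` as lateness is also an interpretation of the two column descriptions rather than something the source states.

```python
import pandas as pd

PASS_MARK = 40  # from the description: a score lower than 40 is a Fail

# Sample rows (assumption: copied by hand from the table display in this
# section; all five belong to assessment 1752).
student_assessment_sample = pd.DataFrame({
    "id_assessment": [1752] * 5,
    "id_student": [11391, 28400, 31604, 32885, 38053],
    "date_submitted": [18, 22, 17, 26, 19],
    "is_banked": [0] * 5,
    "score": [78.0, 70.0, 72.0, 69.0, 79.0],
})

# Matching row from the assessments table: assessment 1752 is due on day 19.
assessments_sample = pd.DataFrame({"id_assessment": [1752], "date": [19.0]})

# validate="many_to_one" raises if id_assessment is not unique on the right.
merged = student_assessment_sample.merge(
    assessments_sample, on="id_assessment", how="left", validate="many_to_one"
)

# A missing score compares as False here, so it is counted as not passed.
merged["passed"] = merged["score"] >= PASS_MARK
# Negative values mean the work was submitted before the deadline.
merged["days_late"] = merged["date_submitted"] - merged["date"]
print(merged[["id_student", "score", "passed", "days_late"]])
```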

![image.png](attachment:d13d54dd-ffd8-4f55-8c05-4af844a01f25.png)

In [5]:
studentAssessment

Unnamed: 0,id_assessment,id_student,date_submitted,is_banked,score
0,1752,11391,18,0,78.0
1,1752,28400,22,0,70.0
2,1752,31604,17,0,72.0
3,1752,32885,26,0,69.0
4,1752,38053,19,0,79.0
...,...,...,...,...,...
173907,37443,527538,227,0,60.0
173908,37443,534672,229,0,100.0
173909,37443,546286,215,0,80.0
173910,37443,546724,230,0,100.0


# studentInfo

- code_module : an identification code for a module on which the student is registered.
- code_presentation : the identification code of the presentation during which the student is registered on the module.
- id_student : a unique identification number for the student.
- gender : the student’s gender.
- region : identifies the geographic region, where the student lived while taking the module-presentation.
- highest_education : highest student education level on entry to the module presentation.
- imd_band : specifies the Index of Multiple Deprivation band of the place where the student lived during the module-presentation.
- age_band : band of the student’s age.
- num_of_prev_attempts : the number of times the student has attempted this module.
- studied_credits : the total number of credits for the modules the student is currently studying.
- disability : indicates whether the student has declared a disability.
- final_result : student’s final result in the module-presentation.
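
One detail worth handling before grouping by `imd_band`: in the rows displayed in this section, most bands end in `%` (for example `20-30%`) but one appears as `10-20`. The sketch below shows one possible way to make the labels consistent. The variable names and the sample values (copied from the displayed rows) are illustrative assumptions, and it has not been checked here whether other spellings occur in the full file.

```python
import pandas as pd

# Sample values (assumption: copied by hand from the imd_band column of the
# displayed rows, including the one label without a percent sign).
imd = pd.Series(["90-100%", "20-30%", "30-40%", "10-20", "40-50%"], name="imd_band")

# Strip any trailing "%" and add it back so every band is spelled the same way.
imd_clean = imd.str.rstrip("%") + "%"
print(imd_clean.tolist())
```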

![image.png](attachment:5ee44a67-93e0-4cf9-9824-49b8bbcbdd6f.png)

In [6]:
studentInfo

Unnamed: 0,code_module,code_presentation,id_student,gender,region,highest_education,imd_band,age_band,num_of_prev_attempts,studied_credits,disability,final_result
0,AAA,2013J,11391,M,East Anglian Region,HE Qualification,90-100%,55<=,0,240,N,Pass
1,AAA,2013J,28400,F,Scotland,HE Qualification,20-30%,35-55,0,60,N,Pass
2,AAA,2013J,30268,F,North Western Region,A Level or Equivalent,30-40%,35-55,0,60,Y,Withdrawn
3,AAA,2013J,31604,F,South East Region,A Level or Equivalent,50-60%,35-55,0,60,N,Pass
4,AAA,2013J,32885,F,West Midlands Region,Lower Than A Level,50-60%,0-35,0,60,N,Pass
...,...,...,...,...,...,...,...,...,...,...,...,...
32588,GGG,2014J,2640965,F,Wales,Lower Than A Level,10-20,0-35,0,30,N,Fail
32589,GGG,2014J,2645731,F,East Anglian Region,Lower Than A Level,40-50%,35-55,0,30,N,Distinction
32590,GGG,2014J,2648187,F,South Region,A Level or Equivalent,20-30%,0-35,0,30,Y,Pass
32591,GGG,2014J,2679821,F,South East Region,Lower Than A Level,90-100%,35-55,0,30,N,Withdrawn


# studentRegistration

- code_module : an identification code for a module.
- code_presentation : the identification code of the presentation.
- id_student : a unique identification number for the student.
- date_registration : the date of the student’s registration on the module presentation, measured as the number of days relative to the start of the module-presentation (e.g. the negative value -30 means that the student registered 30 days before it started).
- date_unregistration : the date of the student’s unregistration from the module presentation, measured as the number of days relative to the start of the module-presentation. Students who completed the course have this field empty. Students who unregistered have Withdrawn as the value of the final_result column in the studentInfo.csv file.
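
The link between `date_unregistration` and `final_result` described above can be cross-checked by joining the two tables. This is a hedged sketch, not the notebook's own code: the column names `unregistered` and `consistent` are made up for illustration, the sample rows are copied by hand from the displayed tables, and the one-to-one join assumes that each (module, presentation, student) triple occurs once per table.

```python
import numpy as np
import pandas as pd

# Sample rows (assumption: copied by hand from the displayed tables).
registration_sample = pd.DataFrame({
    "code_module": ["AAA"] * 3,
    "code_presentation": ["2013J"] * 3,
    "id_student": [11391, 28400, 30268],
    "date_registration": [-159.0, -53.0, -92.0],
    "date_unregistration": [np.nan, np.nan, 12.0],
})
info_sample = pd.DataFrame({
    "code_module": ["AAA"] * 3,
    "code_presentation": ["2013J"] * 3,
    "id_student": [11391, 28400, 30268],
    # The displayed studentInfo rows use the label "Withdrawn".
    "final_result": ["Pass", "Pass", "Withdrawn"],
})

keys = ["code_module", "code_presentation", "id_student"]
# Assumption: the key triple is unique in both tables; validate raises if not.
check = registration_sample.merge(
    info_sample, on=keys, how="left", validate="one_to_one"
)

# An empty date_unregistration means the student did not unregister.
check["unregistered"] = check["date_unregistration"].notna()
check["consistent"] = check["unregistered"] == (check["final_result"] == "Withdrawn")
print(check[["id_student", "unregistered", "final_result", "consistent"]])
```

Rows where `consistent` is False would be the ones to inspect by hand on the full data.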

![image.png](attachment:65e70e41-42fb-443a-aabb-c5427fc1b616.png)

In [7]:
studentRegistration

Unnamed: 0,code_module,code_presentation,id_student,date_registration,date_unregistration
0,AAA,2013J,11391,-159.0,
1,AAA,2013J,28400,-53.0,
2,AAA,2013J,30268,-92.0,12.0
3,AAA,2013J,31604,-52.0,
4,AAA,2013J,32885,-176.0,
...,...,...,...,...,...
32588,GGG,2014J,2640965,-4.0,
32589,GGG,2014J,2645731,-23.0,
32590,GGG,2014J,2648187,-129.0,
32591,GGG,2014J,2679821,-49.0,101.0


# studentVle

- code_module : an identification code for a module.
- code_presentation : the identification code of the module presentation.
- id_student : a unique identification number for the student.
- id_site : an identification number for the VLE material.
- date : the date of the student’s interaction with the material, measured as the number of days since the start of the module-presentation.
- sum_click : the number of times a student interacts with the material on that day.
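
In the rows displayed in this section, the same student, material and day appear on several rows, so click counts usually need to be aggregated before analysis. The following sketch shows two possible roll-ups, per material and per day. The frame names `per_site` and `per_day` are illustrative assumptions, and the sample rows are copied by hand from the displayed table.

```python
import pandas as pd

# Sample rows (assumption: the first five rows of the displayed table).
student_vle_sample = pd.DataFrame({
    "code_module": ["AAA"] * 5,
    "code_presentation": ["2013J"] * 5,
    "id_student": [28400] * 5,
    "id_site": [546652, 546652, 546652, 546614, 546714],
    "date": [-10] * 5,
    "sum_click": [4, 1, 1, 11, 1],
})

student_day = ["code_module", "code_presentation", "id_student", "date"]

# Total clicks per student, material and day (collapses repeated rows).
per_site = (
    student_vle_sample
    .groupby(student_day + ["id_site"], as_index=False)["sum_click"]
    .sum()
)

# Total clicks per student and day across all materials.
per_day = (
    student_vle_sample
    .groupby(student_day, as_index=False)["sum_click"]
    .sum()
)
print(per_site)
print(per_day)
```

The full `studentVle` table has over ten million rows, so aggregating early in this way may also reduce memory use in later joins.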

![image.png](attachment:5f7eaf0a-952d-42ef-8385-d12551204424.png)

In [8]:
studentVle

Unnamed: 0,code_module,code_presentation,id_student,id_site,date,sum_click
0,AAA,2013J,28400,546652,-10,4
1,AAA,2013J,28400,546652,-10,1
2,AAA,2013J,28400,546652,-10,1
3,AAA,2013J,28400,546614,-10,11
4,AAA,2013J,28400,546714,-10,1
...,...,...,...,...,...,...
10655275,GGG,2014J,675811,896943,269,3
10655276,GGG,2014J,675578,896943,269,1
10655277,GGG,2014J,654064,896943,269,3
10655278,GGG,2014J,654064,896939,269,1


# vle

- id_site : an identification number of the material.
- code_module : an identification code for the module.
- code_presentation : the identification code of the presentation.
- activity_type : the role associated with the module material.
- week_from : the week from which the material is planned to be used.
- week_to : the week until which the material is planned to be used.
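
A quick profile of this table helps decide how useful `week_from` and `week_to` are, since every row displayed in this section has both fields empty. The sketch below counts materials per `activity_type` and measures the share of missing planned weeks. The variable names and the hand-copied sample rows are assumptions for illustration; the figures for the full file may differ.

```python
import numpy as np
import pandas as pd

# Sample rows (assumption: the first five rows of the displayed table).
vle_sample = pd.DataFrame({
    "id_site": [546943, 546712, 546998, 546888, 547035],
    "code_module": ["AAA"] * 5,
    "code_presentation": ["2013J"] * 5,
    "activity_type": ["resource", "oucontent", "resource", "url", "resource"],
    "week_from": [np.nan] * 5,
    "week_to": [np.nan] * 5,
})

# Number of materials for each activity type.
type_counts = vle_sample["activity_type"].value_counts()

# Fraction of rows with no planned week (1.0 means always missing).
missing_share = vle_sample[["week_from", "week_to"]].isna().mean()
print(type_counts)
print(missing_share)
```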

![image.png](attachment:dba1d196-67d5-4528-ac6a-a4e02b1214e6.png)

In [9]:
vle

Unnamed: 0,id_site,code_module,code_presentation,activity_type,week_from,week_to
0,546943,AAA,2013J,resource,,
1,546712,AAA,2013J,oucontent,,
2,546998,AAA,2013J,resource,,
3,546888,AAA,2013J,url,,
4,547035,AAA,2013J,resource,,
...,...,...,...,...,...,...
6359,897063,GGG,2014J,resource,,
6360,897109,GGG,2014J,resource,,
6361,896965,GGG,2014J,oucontent,,
6362,897060,GGG,2014J,resource,,
