## PISA Data 2012
This notebook analyzes the PISA 2012 dataset for Udacity's Data Analyst Nanodegree.
Here we explore and clean the original PISA dataset; the findings are presented in the next notebook.

The PISA database contains the full set of responses from individual students, school principals and parents. PISA is a survey of students' skills and knowledge as they approach the end of compulsory education. It is not a conventional school test. Rather than examining how well students have learned the school curriculum, it looks at how well prepared they are for life beyond school.

Around 510,000 students in 65 economies took part in the PISA 2012 assessment of reading, mathematics and science representing about 28 million 15-year-olds globally. Of those economies, 44 took part in an assessment of creative problem solving and 18 in an assessment of financial literacy.

### What is the structure of your dataset?

* The dataset has 485,490 students (rows), and we keep 18 selected columns.
* The loaded file has 636 columns (`df.shape` below reports `(485490, 636)`).
* We chose the columns we were interested in and created a new dataset from them.
* Most of the columns hold categorical variables.

### What is/are the main feature(s) of interest in your dataset?

* We will explore the students' math, science, reading and average scores.
* We will also explore which factors the scores depend on.
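
The `math_score`, `reading_score`, `science_score` and `ave_score` columns are not part of the raw PISA file. Judging by the first row shown in the `df.head(1)` output of this notebook, each subject score matches the mean of that subject's five plausible values (`PV1MATH` to `PV5MATH`, and so on), and `ave_score` matches the mean of the three subject scores. The sketch below reproduces those numbers for that single row; the helper name `plausible_value_mean` is hypothetical, and the construction is inferred from the displayed values rather than confirmed by the notebook.

```python
import pandas as pd

# Plausible values of the first student, copied from the df.head(1) output.
row = pd.DataFrame({
    "PV1MATH": [406.8469], "PV2MATH": [376.4683], "PV3MATH": [344.5319],
    "PV4MATH": [321.1637], "PV5MATH": [381.9209],
    "PV1READ": [249.5762], "PV2READ": [254.342], "PV3READ": [406.8496],
    "PV4READ": [175.7053], "PV5READ": [218.5981],
    "PV1SCIE": [341.7009], "PV2SCIE": [408.84], "PV3SCIE": [348.2283],
    "PV4SCIE": [367.8105], "PV5SCIE": [392.9877],
})


def plausible_value_mean(frame, suffix):
    """Row-wise mean of PV1<suffix>..PV5<suffix> (hypothetical helper)."""
    cols = [f"PV{i}{suffix}" for i in range(1, 6)]
    return frame[cols].mean(axis=1)


row["math_score"] = plausible_value_mean(row, "MATH")
row["reading_score"] = plausible_value_mean(row, "READ")
row["science_score"] = plausible_value_mean(row, "SCIE")

# Overall score: mean of the three subject scores.
score_cols = ["math_score", "reading_score", "science_score"]
row["ave_score"] = row[score_cols].mean(axis=1)

print(row[score_cols + ["ave_score"]].round(5))
```

The printed values agree with the `math_score`, `reading_score`, `science_score` and `ave_score` entries of that row to five decimal places.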

### What features in the dataset do you think will help support your investigation into your feature(s) of interest?
In this analysis we will use the following features of the dataset:

* Country                         
* Gender                          
* late_for_sch                    
* possessions_room                
* possessions_literature            
* math_imprtnt_future_student     
* math_imprtnt_future_parents     
* possessions_computer            
* class_size                         
* teach_diff_task_to_diff_stud    
* teach_gives_feedback            
* stud_belong_at_sch              
* stud_give_up_easy               
* math_score                       
* reading_score                    
* science_score                    
* ave_score                        
* possessions_litterature         

Most of these variables are categorical. Together they will support the analysis of the scores.

In [44]:
import requests
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [2]:
df = pd.read_csv('pisa2012.csv', sep=',',  encoding='latin-1')
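
When a file this wide is read in chunks, pandas may warn that some columns contain mixed types, and an index that was exported with the data comes back as an `Unnamed: 0` column. The sketch below shows two `read_csv` options that address both points. It uses a tiny in-memory stand-in for `pisa2012.csv` with illustrative values, so the column selection here is an assumption, not the notebook's actual loading step.

```python
import io

import pandas as pd

# Tiny stand-in for pisa2012.csv: an exported index column (empty header)
# plus one column with a missing value. Values are illustrative only.
csv_text = "\n".join([
    ",CNT,ST06Q01",
    "0,Albania,6.0",
    "1,Albania,",
    "2,Vietnam,7.0",
])

demo = pd.read_csv(
    io.StringIO(csv_text),
    sep=",",
    index_col=0,       # reuse the exported index instead of creating 'Unnamed: 0'
    low_memory=False,  # infer each column's dtype from the whole file at once
)

print(demo.dtypes)
```

For the real file, the same keyword arguments could be passed alongside the path and `encoding='latin-1'`. Note that `low_memory=False` trades higher peak memory for consistent dtype inference.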



In [3]:
df['CNT']

0         Albania
1         Albania
2         Albania
3         Albania
4         Albania
           ...   
485485    Vietnam
485486    Vietnam
485487    Vietnam
485488    Vietnam
485489    Vietnam
Name: CNT, Length: 485490, dtype: object

As the output shows, the dataset is large (485,490 rows). To work with it, we will drop most of the columns and keep only the ones we are interested in.
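
Keeping only the columns of interest can be sketched as follows. The frame `wide` is a toy stand-in with illustrative values, and the readable names `Country` and `Gender` are taken from the feature list above; which raw columns map to the remaining feature names is not shown here.

```python
import pandas as pd

# Toy stand-in for the wide PISA frame (illustrative values only).
wide = pd.DataFrame({
    "CNT": ["Albania", "Vietnam"],
    "ST04Q01": ["Female", "Male"],
    "PV1MATH": [406.8469, 500.0],
    "W_FSTR1": [13.1249, 4.5],
})

# Columns to keep, and readable names for the coded ones.
keep = ["CNT", "ST04Q01", "PV1MATH"]
readable = {"CNT": "Country", "ST04Q01": "Gender"}

# .copy() makes the subset independent of the original frame,
# so later edits do not trigger chained-assignment warnings.
subset = wide[keep].rename(columns=readable).copy()

print(subset.columns.tolist())
```

If memory is a concern, passing the same list as `usecols=keep` to `pd.read_csv` avoids loading the unwanted columns in the first place.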

In [4]:
df.shape

(485490, 636)

In [5]:
df.CNT.unique()

array(['Albania', 'United Arab Emirates', 'Argentina', 'Australia',
       'Austria', 'Belgium', 'Bulgaria', 'Brazil', 'Canada',
       'Switzerland', 'Chile', 'Colombia', 'Costa Rica', 'Czech Republic',
       'Germany', 'Denmark', 'Spain', 'Estonia', 'Finland', 'France',
       'United Kingdom', 'Greece', 'Hong Kong-China', 'Croatia',
       'Hungary', 'Indonesia', 'Ireland', 'Iceland', 'Israel', 'Italy',
       'Jordan', 'Japan', 'Kazakhstan', 'Korea', 'Liechtenstein',
       'Lithuania', 'Luxembourg', 'Latvia', 'Macao-China', 'Mexico',
       'Montenegro', 'Malaysia', 'Netherlands', 'Norway', 'New Zealand',
       'Peru', 'Poland', 'Portugal', 'Qatar', 'China-Shanghai',
       'Perm(Russian Federation)', 'Florida (USA)', 'Connecticut (USA)',
       'Massachusetts (USA)', 'Romania', 'Russian Federation',
       'Singapore', 'Serbia', 'Slovak Republic', 'Slovenia', 'Sweden',
       'Chinese Taipei', 'Thailand', 'Tunisia', 'Turkey', 'Uruguay',
       'United States of America', 'Vietn

In [6]:
# Check the country values
df['CNT'].value_counts()

Mexico                      33806
Italy                       31073
Spain                       25313
Canada                      21544
Brazil                      19204
                            ...  
Florida (USA)                1896
Perm(Russian Federation)     1761
Massachusetts (USA)          1723
Connecticut (USA)            1697
Liechtenstein                 293
Name: CNT, Length: 68, dtype: int64

In [7]:
# Check whether any rows are duplicated
df.duplicated().any()

False
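
One caveat about this check: `duplicated()` compares whole rows, and the file carries exported row counters (`Unnamed: 0`, `Unnamed: 0.1`) that differ on every row, so a full-row comparison cannot flag repeats. A minimal sketch of a check restricted to identifying columns follows. It assumes that `CNT`, `SCHOOLID` and `STIDSTD` together identify a student, which this notebook does not confirm, and it uses toy values.

```python
import pandas as pd

# Toy frame: the exported counter is unique, but the second row repeats
# the first student's identifiers (illustrative values only).
demo = pd.DataFrame({
    "Unnamed: 0": [1, 2, 3],
    "CNT": ["Albania", "Albania", "Vietnam"],
    "SCHOOLID": [1, 1, 5],
    "STIDSTD": [1, 1, 9],
})

# A full-row comparison sees three distinct rows because of the counter.
full_row_dupes = demo.duplicated().any()

# Restricting the comparison to the assumed identifier columns finds the repeat.
id_cols = ["CNT", "SCHOOLID", "STIDSTD"]
id_dupes = demo.duplicated(subset=id_cols).any()

print(full_row_dupes, id_dupes)
```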

#### Student Information

* CNT: Country
* ST04Q01: Gender

#### Student Mathematics Scores

* PV1MATH: Overall Mathematics Score

#### Student's Attitude Towards Mathematics

##### Instrumental Motivation

* ST29Q02: Worthwhile for Work
* ST29Q05: Worthwhile for Career Chances
* ST29Q07: Important for Future Study
* ST29Q08: Helps to Get a Job

##### Math Anxiety

* ST42Q01: Worry That It Will Be Difficult
* ST42Q03: Get Very Tense
* ST42Q05: Get Very Nervous
* ST42Q08: Feel Helpless
* ST42Q10: Worry About Getting Poor (Grades)

##### Math Self-Concept

* ST42Q02: Not Good at Maths
* ST42Q04: Get Good (Grades)
* ST42Q06: Learn Quickly
* ST42Q07: One of Best Subjects
* ST42Q09: Understand Difficult Work

##### Math Interest

* ST29Q01: Enjoy Reading
* ST29Q03: Look Forward to Lessons
* ST29Q04: Enjoy Maths
* ST29Q06: Interested

##### Math Work Ethic

* ST46Q01: Homework Completed in Time
* ST46Q02: Work Hard on Homework
* ST46Q03: Prepared for Exams
* ST46Q04: Study Hard for Quizzes
* ST46Q05: Study Until I Understand Everything
* ST46Q06: Pay Attention in Classes
* ST46Q07: Listen in Classes
* ST46Q08: Avoid Distractions When Studying
* ST46Q09: Keep Work Organized

##### Math Behaviour

* ST49Q01: Talk about Maths with Friends
* ST49Q02: Help Friends with Maths
* ST49Q03: (Extracurricular) Activity
* ST49Q04: Participate in Competitions
* ST49Q05: Study More Than 2 Extra Hours a Day
* ST49Q06: Play Chess
* ST49Q07: Computer programming
* ST49Q09: Participate in Math Club

##### Subjective Norms

* ST35Q04: Parents Believe Studying Mathematics Is Important
* ST35Q05: Parents Believe Mathematics Is Important for Career
* ST35Q06: Parents Like Mathematics


In [10]:
#organize features by category list

#CNT: country, ST04Q01: gender
student_info=['CNT','ST04Q01']

#PV1MATH: Overall Math Score
scores=['PV1MATH']

#ST29Q02: Worthwhile for Work, ST29Q05: Worthwhile for Career Chances, ST29Q07: Important for Future Study
#ST29Q08: Helps to Get a Job
motivation=['ST29Q02','ST29Q05','ST29Q07','ST29Q08']

#ST42Q01: Worry That It Will Be Difficult, ST42Q03: Get Very Tense, ST42Q05: Get Very Nervous
#ST42Q08: Feel Helpless, ST42Q10: Worry About Getting Poor <Grades>
anxiety=['ST42Q01','ST42Q03','ST42Q05','ST42Q08','ST42Q10']

#ST42Q02: Not Good at Maths, ST42Q04: Get Good (Grades), ST42Q06: Learn Quickly
#ST42Q07: One of Best Subjects, ST42Q09: Understand Difficult Work
self=['ST42Q02','ST42Q04','ST42Q06','ST42Q07','ST42Q09']

#ST29Q01: Enjoy Reading, ST29Q03: Look Forward to Lessons, ST29Q04: Enjoy Maths, ST29Q06: Interested
interest=['ST29Q01','ST29Q03','ST29Q04','ST29Q06']

#ST46Q01: Homework Completed in Time, ST46Q02: Work Hard on Homework, ST46Q03: Prepared for Exams
#ST46Q04: Study Hard for Quizzes, ST46Q05: Study Until I Understand Everything, ST46Q06: Pay Attention in Classes
#ST46Q07: Listen in Classes, ST46Q08: Avoid Distractions When Studying, ST46Q09: Keep Work Organized
work_ethic= ['ST46Q01','ST46Q02','ST46Q03','ST46Q04','ST46Q05','ST46Q06','ST46Q07','ST46Q08','ST46Q09']

#ST49Q01: Talk about Maths with Friends,ST49Q02: Help Friends with Maths, ST49Q03: (Extracurricular) Activity
#ST49Q04: Participate in Competitions, ST49Q05: Study More Than 2 Extra Hours a Day, ST49Q06: Play Chess
#ST49Q07: Computer programming, ST49Q09: Participate in Math Club
behavior=['ST49Q01','ST49Q02','ST49Q03','ST49Q04','ST49Q05','ST49Q06','ST49Q07','ST49Q09']

#ST35Q04: Parents Believe Studying Mathematics Is Important
#ST35Q05: Parents Believe Mathematics Is Important for Career
#ST35Q06:Parents Like Mathematics
parents=['ST35Q04','ST35Q05','ST35Q06']
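
The category lists above can be combined to pull out one working subset. The sketch below uses three of the lists and an empty stand-in frame that carries only column names; the names `feature_groups`, `selected` and `missing` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Three of the category lists defined in the cell above.
student_info = ["CNT", "ST04Q01"]
scores = ["PV1MATH"]
parents = ["ST35Q04", "ST35Q05", "ST35Q06"]

# Group the lists under readable keys, then flatten them in order.
feature_groups = {
    "student_info": student_info,
    "scores": scores,
    "parents": parents,
}
selected = [col for group in feature_groups.values() for col in group]

# Empty stand-in frame with the needed column names plus one extra column.
demo = pd.DataFrame(columns=selected + ["W_FSTUWT"])

# Fail early on typos in column codes before subsetting.
missing = [col for col in selected if col not in demo.columns]
subset = demo[selected]

print(selected, missing)
```

As a side note, the list named `self` in the cell above is legal Python, but the name is conventionally reserved for the instance argument of methods, so a name such as `self_concept` may be easier to read.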

In [12]:
df['CNT'].value_counts()[['United States of America','Florida (USA)','Massachusetts (USA)','Connecticut (USA)']].sum()

10294
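
The same count can be obtained with a boolean mask, and the four labels can be folded into one if the analysis should treat them as a single country. The sketch below runs on a toy `CNT` series with illustrative rows; whether merging the state samples into the national one is appropriate is a decision for the analysis, not something this notebook establishes.

```python
import pandas as pd

# Toy CNT column (illustrative rows only).
cnt = pd.Series(
    [
        "United States of America",
        "Florida (USA)",
        "Massachusetts (USA)",
        "Connecticut (USA)",
        "Albania",
        "Florida (USA)",
    ],
    name="CNT",
)

us_labels = [
    "United States of America",
    "Florida (USA)",
    "Massachusetts (USA)",
    "Connecticut (USA)",
]

# Number of rows that carry any of the US labels.
us_rows = cnt.isin(us_labels).sum()

# Optional: map the three state labels onto the national label.
state_to_country = {label: "United States of America" for label in us_labels[1:]}
merged = cnt.replace(state_to_country)

print(us_rows, merged.value_counts().to_dict())
```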

In [13]:
df[scores].head()

|   | PV1MATH |
|---|---|
| 0 | 406.8469 |
| 1 | 486.1427 |
| 2 | 533.2684 |
| 3 | 412.2215 |
| 4 | 381.9209 |


In [32]:
df.head(1)

Unnamed: 0.1,Unnamed: 0,CNT,SUBNATIO,STRATUM,OECD,NC,SCHOOLID,STIDSTD,ST01Q01,ST02Q01,ST03Q01,ST03Q02,ST04Q01,ST05Q01,ST06Q01,ST07Q01,ST07Q02,ST07Q03,ST08Q01,ST09Q01,ST115Q01,ST11Q01,ST11Q02,ST11Q03,ST11Q04,ST11Q05,ST11Q06,ST13Q01,ST14Q01,ST14Q02,ST14Q03,ST14Q04,ST15Q01,ST17Q01,ST18Q01,ST18Q02,ST18Q03,ST18Q04,ST19Q01,ST20Q01,ST20Q02,ST20Q03,ST21Q01,ST25Q01,ST26Q01,ST26Q02,ST26Q03,ST26Q04,ST26Q05,ST26Q06,ST26Q07,ST26Q08,ST26Q09,ST26Q10,ST26Q11,ST26Q12,ST26Q13,ST26Q14,ST26Q15,ST26Q16,ST26Q17,ST27Q01,ST27Q02,ST27Q03,ST27Q04,ST27Q05,ST28Q01,ST29Q01,ST29Q02,ST29Q03,ST29Q04,ST29Q05,ST29Q06,ST29Q07,ST29Q08,ST35Q01,ST35Q02,ST35Q03,ST35Q04,ST35Q05,ST35Q06,ST37Q01,ST37Q02,ST37Q03,ST37Q04,ST37Q05,ST37Q06,ST37Q07,ST37Q08,ST42Q01,ST42Q02,ST42Q03,ST42Q04,ST42Q05,ST42Q06,ST42Q07,ST42Q08,ST42Q09,ST42Q10,ST43Q01,ST43Q02,ST43Q03,ST43Q04,ST43Q05,ST43Q06,ST44Q01,ST44Q03,ST44Q04,ST44Q05,ST44Q07,ST44Q08,ST46Q01,ST46Q02,ST46Q03,ST46Q04,ST46Q05,ST46Q06,ST46Q07,ST46Q08,ST46Q09,ST48Q01,ST48Q02,ST48Q03,ST48Q04,ST48Q05,ST49Q01,ST49Q02,ST49Q03,ST49Q04,ST49Q05,ST49Q06,ST49Q07,ST49Q09,ST53Q01,ST53Q02,ST53Q03,ST53Q04,ST55Q01,ST55Q02,ST55Q03,ST55Q04,ST57Q01,ST57Q02,ST57Q03,ST57Q04,ST57Q05,ST57Q06,ST61Q01,ST61Q02,ST61Q03,ST61Q04,ST61Q05,ST61Q06,ST61Q07,ST61Q08,ST61Q09,ST62Q01,ST62Q02,ST62Q03,ST62Q04,ST62Q06,ST62Q07,ST62Q08,ST62Q09,ST62Q10,ST62Q11,ST62Q12,ST62Q13,ST62Q15,ST62Q16,ST62Q17,ST62Q19,ST69Q01,ST69Q02,ST69Q03,ST70Q01,ST70Q02,ST70Q03,ST71Q01,ST72Q01,ST73Q01,ST73Q02,ST74Q01,ST74Q02,ST75Q01,ST75Q02,ST76Q01,ST76Q02,ST77Q01,ST77Q02,ST77Q04,ST77Q05,ST77Q06,ST79Q01,ST79Q02,ST79Q03,ST79Q04,ST79Q05,ST79Q06,ST79Q07,ST79Q08,ST79Q10,ST79Q11,ST79Q12,ST79Q15,ST79Q17,ST80Q01,ST80Q04,ST80Q05,ST80Q06,ST80Q07,ST80Q08,ST80Q09,ST80Q10,ST80Q11,ST81Q01,ST81Q02,ST81Q03,ST81Q04,ST81Q05,ST82Q01,ST82Q02,ST82Q03,ST83Q01,ST83Q02,ST83Q03,ST83Q04,ST84Q01,ST84Q02,ST84Q03,ST85Q01,ST85Q02,ST85Q03,ST85Q04,ST86Q01,ST86Q02,ST86Q03,ST86Q04,ST86Q05,ST87Q01,ST87Q02,ST87Q03,ST87Q04,ST87Q05,ST87Q06,ST87Q07,ST87Q08,ST87Q09,ST88Q01,S
T88Q02,ST88Q03,ST88Q04,ST89Q02,ST89Q03,ST89Q04,ST89Q05,ST91Q01,ST91Q02,ST91Q03,ST91Q04,ST91Q05,ST91Q06,ST93Q01,ST93Q03,ST93Q04,ST93Q06,ST93Q07,ST94Q05,ST94Q06,ST94Q09,ST94Q10,ST94Q14,ST96Q01,ST96Q02,ST96Q03,ST96Q05,ST101Q01,ST101Q02,ST101Q03,ST101Q05,ST104Q01,ST104Q04,ST104Q05,ST104Q06,IC01Q01,IC01Q02,IC01Q03,IC01Q04,IC01Q05,IC01Q06,IC01Q07,IC01Q08,IC01Q09,IC01Q10,IC01Q11,IC02Q01,IC02Q02,IC02Q03,IC02Q04,IC02Q05,IC02Q06,IC02Q07,IC03Q01,IC04Q01,IC05Q01,IC06Q01,IC07Q01,IC08Q01,IC08Q02,IC08Q03,IC08Q04,IC08Q05,IC08Q06,IC08Q07,IC08Q08,IC08Q09,IC08Q11,IC09Q01,...,IC09Q06,IC09Q07,IC10Q01,IC10Q02,IC10Q03,IC10Q04,IC10Q05,IC10Q06,IC10Q07,IC10Q08,IC10Q09,IC11Q01,IC11Q02,IC11Q03,IC11Q04,IC11Q05,IC11Q06,IC11Q07,IC22Q01,IC22Q02,IC22Q04,IC22Q06,IC22Q07,IC22Q08,EC01Q01,EC02Q01,EC03Q01,EC03Q02,EC03Q03,EC03Q04,EC03Q05,EC03Q06,EC03Q07,EC03Q08,EC03Q09,EC03Q10,EC04Q01A,EC04Q01B,EC04Q01C,EC04Q02A,EC04Q02B,EC04Q02C,EC04Q03A,EC04Q03B,EC04Q03C,EC04Q04A,EC04Q04B,EC04Q04C,EC04Q05A,EC04Q05B,EC04Q05C,EC04Q06A,EC04Q06B,EC04Q06C,EC05Q01,EC06Q01,EC07Q01,EC07Q02,EC07Q03,EC07Q04,EC07Q05,EC08Q01,EC08Q02,EC08Q03,EC08Q04,EC09Q03,EC10Q01,EC11Q02,EC11Q03,EC12Q01,ST22Q01,ST23Q01,ST23Q02,ST23Q03,ST23Q04,ST23Q05,ST23Q06,ST23Q07,ST23Q08,ST24Q01,ST24Q02,ST24Q03,CLCUSE1,CLCUSE301,CLCUSE302,DEFFORT,QUESTID,BOOKID,EASY,AGE,GRADE,PROGN,ANXMAT,ATSCHL,ATTLNACT,BELONG,BFMJ2,BMMJ1,CLSMAN,COBN_F,COBN_M,COBN_S,COGACT,CULTDIST,CULTPOS,DISCLIMA,ENTUSE,ESCS,EXAPPLM,EXPUREM,FAILMAT,FAMCON,FAMCONC,FAMSTRUC,FISCED,HEDRES,HERITCUL,HISCED,HISEI,HOMEPOS,HOMSCH,HOSTCUL,ICTATTNEG,ICTATTPOS,ICTHOME,ICTRES,ICTSCH,IMMIG,INFOCAR,INFOJOB1,INFOJOB2,INSTMOT,INTMAT,ISCEDD,ISCEDL,ISCEDO,LANGCOMM,LANGN,LANGRPPD,LMINS,MATBEH,MATHEFF,MATINTFC,MATWKETH,MISCED,MMINS,MTSUP,OCOD1,OCOD2,OPENPS,OUTHOURS,PARED,PERSEV,REPEAT,SCMAT,SMINS,STUDREL,SUBNORM,TCHBEHFA,TCHBEHSO,TCHBEHTD,TEACHSUP,TESTLANG,TIMEINT,USEMATH,USESCH,WEALTH,ANCATSCHL,ANCATTLNACT,ANCBELONG,ANCCLSMAN,ANCCOGACT,ANCINSTMOT,ANCINTMAT,ANCMATWKETH,ANCMTSUP,ANCSCMAT,ANCSTUDREL,ANCSUBNORM,P
V1MATH,PV2MATH,PV3MATH,PV4MATH,PV5MATH,PV1MACC,PV2MACC,PV3MACC,PV4MACC,PV5MACC,PV1MACQ,PV2MACQ,PV3MACQ,PV4MACQ,PV5MACQ,PV1MACS,PV2MACS,PV3MACS,PV4MACS,PV5MACS,PV1MACU,PV2MACU,PV3MACU,PV4MACU,PV5MACU,PV1MAPE,PV2MAPE,PV3MAPE,PV4MAPE,PV5MAPE,PV1MAPF,PV2MAPF,PV3MAPF,PV4MAPF,PV5MAPF,PV1MAPI,PV2MAPI,PV3MAPI,PV4MAPI,PV5MAPI,PV1READ,PV2READ,PV3READ,PV4READ,PV5READ,PV1SCIE,PV2SCIE,PV3SCIE,PV4SCIE,PV5SCIE,W_FSTUWT,W_FSTR1,W_FSTR2,W_FSTR3,W_FSTR4,W_FSTR5,W_FSTR6,W_FSTR7,W_FSTR8,W_FSTR9,W_FSTR10,W_FSTR11,W_FSTR12,W_FSTR13,W_FSTR14,W_FSTR15,W_FSTR16,W_FSTR17,W_FSTR18,W_FSTR19,W_FSTR20,W_FSTR21,W_FSTR22,W_FSTR23,W_FSTR24,W_FSTR25,W_FSTR26,W_FSTR27,W_FSTR28,W_FSTR29,W_FSTR30,W_FSTR31,W_FSTR32,W_FSTR33,W_FSTR34,W_FSTR35,W_FSTR36,W_FSTR37,W_FSTR38,W_FSTR39,W_FSTR40,W_FSTR41,W_FSTR42,W_FSTR43,W_FSTR44,W_FSTR45,W_FSTR46,W_FSTR47,W_FSTR48,W_FSTR49,W_FSTR50,W_FSTR51,W_FSTR52,W_FSTR53,W_FSTR54,W_FSTR55,W_FSTR56,W_FSTR57,W_FSTR58,W_FSTR59,W_FSTR60,W_FSTR61,W_FSTR62,W_FSTR63,W_FSTR64,W_FSTR65,W_FSTR66,W_FSTR67,W_FSTR68,W_FSTR69,W_FSTR70,W_FSTR71,W_FSTR72,W_FSTR73,W_FSTR74,W_FSTR75,W_FSTR76,W_FSTR77,W_FSTR78,W_FSTR79,W_FSTR80,WVARSTRR,VAR_UNIT,SENWGT_STU,VER_STU,math_score,reading_score,science_score,ave_score
0,1,Albania,80000,ALB0006,Non-OECD,Albania,1,1,10,1.0,2,1996,Female,No,6.0,"No, never","No, never","No, never",,,1.0,Yes,Yes,Yes,Yes,,,<ISCED level 3A>,No,No,No,No,"Other (e.g. home duties, retired)",<ISCED level 3A>,,,,,Working part-time <for pay>,Country of test,Country of test,Country of test,,Language of the test,Yes,No,Yes,No,No,No,No,Yes,No,Yes,No,Yes,No,Yes,8002,8001,8002,Two,One,,,,0-10 books,Agree,Strongly agree,Agree,Agree,Agree,Agree,Agree,Strongly agree,Disagree,Agree,Disagree,Agree,Agree,Agree,Not at all confident,Not very confident,Confident,Confident,Confident,Not at all confident,Confident,Very confident,Agree,Disagree,Agree,Agree,Agree,Agree,Agree,Disagree,Disagree,Disagree,Agree,Disagree,Disagree,Agree,,Disagree,Likely,Slightly likely,Likely,Likely,Likely,Very Likely,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Courses after school Test Language,Major in college Science,Study harder Test Language,Maximum classes Science,Pursuing a career Math,Often,Sometimes,Sometimes,Sometimes,Sometimes,Never or rarely,Never or rarely,Never or rarely,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Never or Hardly Ever,Most Lessons,Never or Hardly Ever,Every Lesson,Most Lessons,Every Lesson,Every Lesson,Every Lesson,Never or Hardly Ever,Most Lessons,Every Lesson,Every Lesson,Every Lesson,Always or almost always,Sometimes,Never or rarely,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Often,Often,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Agree,Agree,Agree,Strongly agree,Strongly agree,Disagree,Agree,Strongly disagree,Disagree,Agree,Agree,Strongly disagree,Agree,Agree,Disagree,Agree,Agree,Strongly disagree,Strongly agree,Strongly agree,Strongly disagree,Agree,Strongly disagree,Agree,Agree,Strongly 
agree,Strongly disagree,Strongly disagree,Agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly disagree,Disagree,Strongly disagree,Very much like me,Very much like me,Very much like me,Somewhat like me,Very much like me,Somewhat like me,Mostly like me,Mostly like me,Mostly like me,Somewhat like me,definitely do this,definitely do this,definitely do this,definitely do this,4.0,2.0,1.0,1.0,1.0,2.0,1.0,1.0,,,,,,,,,,,,,,,,,,,,,99,99,99,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,A Simple calculator,99,99,99,StQ Form B,booklet 7,Standard set of booklets,16.17,0.0,Albania: Upper secondary education,0.32,-2.31,0.5206,-1.18,76.49,79.74,-1.3771,Albania,Albania,Albania,0.6994,,-0.48,1.85,,,,,0.64,,,2.0,"ISCED 3A, ISCED 4",-1.29,,"ISCED 3A, ISCED 4",,-2.61,,,,,,-3.16,,Native,,,,0.8,0.91,A,ISCED level 3,General,,Albanian,,,0.6426,-0.77,-0.7332,0.2882,"ISCED 3A, ISCED 4",,-0.9508,Building architects,Primary school teachers,0.0521,,12.0,-0.3407,Did not repeat a 
<grade>,0.41,,-1.04,-0.0455,1.3625,0.9374,0.4297,1.68,Albanian,,,,-2.92,-1.8636,-0.6779,-0.7351,-0.7808,-0.0219,-0.1562,0.0486,-0.2199,-0.5983,-0.0807,-0.5901,-0.3346,406.8469,376.4683,344.5319,321.1637,381.9209,325.8374,324.2795,279.88,267.417,312.5954,409.1837,388.1524,373.3525,389.7102,415.4152,351.5423,375.6894,341.4161,386.5945,426.3203,396.7207,334.4057,328.9531,339.8582,354.658,324.2795,345.3108,381.1419,380.363,346.8687,319.6059,345.3108,360.8895,390.4892,322.7216,290.7852,345.3108,326.6163,407.6258,367.121,249.5762,254.342,406.8496,175.7053,218.5981,341.7009,408.84,348.2283,367.8105,392.9877,8.9096,13.1249,13.0829,4.5315,13.0829,13.9235,13.1249,13.1249,4.3389,4.3313,13.7954,4.5315,4.3313,13.7954,13.9235,4.3389,4.3313,4.5084,4.5084,13.7954,4.5315,13.1249,13.0829,4.5315,13.0829,13.9235,13.1249,13.1249,4.3389,4.3313,13.7954,4.5315,4.3313,13.7954,13.9235,4.3389,4.3313,4.5084,4.5084,13.7954,4.5315,4.5084,4.5315,13.0829,4.5315,4.3313,4.5084,4.5084,13.7954,13.9235,4.3389,13.0829,13.9235,4.3389,4.3313,13.7954,13.9235,13.1249,13.1249,4.3389,13.0829,4.5084,4.5315,13.0829,4.5315,4.3313,4.5084,4.5084,13.7954,13.9235,4.3389,13.0829,13.9235,4.3389,4.3313,13.7954,13.9235,13.1249,13.1249,4.3389,13.0829,19,1,0.2098,22NOV13,366.18634,261.01424,371.91348,333.03802


In [15]:
#We can display all 636 columns this way
pd.set_option('display.max_columns', 636)
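
`pd.set_option` changes the display limit for the rest of the session. If the wide display is only needed for a single cell, `pd.option_context` applies the setting temporarily and restores the previous value afterwards. A minimal sketch on a 30-column toy frame:

```python
import pandas as pd

# Toy frame with 30 columns (illustrative).
wide = pd.DataFrame([range(30)], columns=[f"col{i}" for i in range(30)])

limit_before = pd.get_option("display.max_columns")

# Inside the block there is no column limit; None means "show all columns".
with pd.option_context("display.max_columns", None):
    limit_inside = pd.get_option("display.max_columns")
    shown = repr(wide)

# On leaving the block, the previous limit is back in place.
limit_after = pd.get_option("display.max_columns")

print(limit_inside, limit_after == limit_before)
```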

In [33]:
df.head(1)



In [17]:
df[scores].describe()

|   | PV1MATH |
|---|---|
| count | 485490.0 |
| mean | 469.621653 |
| std | 103.265391 |
| min | 19.7928 |
| 25% | 395.3186 |
| 50% | 466.2019 |
| 75% | 541.0578 |
| max | 962.2293 |
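
The statistics above treat every sampled student equally. The dataset also contains a `W_FSTUWT` column which, as its name suggests, appears to hold the final student weight. Assuming that reading is correct, a weighted mean can be sketched with `np.average`. The values below are toy numbers chosen to make the effect visible, not PISA data.

```python
import numpy as np
import pandas as pd

# Toy scores and weights (illustrative values only).
demo = pd.DataFrame({
    "PV1MATH": [400.0, 500.0, 600.0],
    "W_FSTUWT": [1.0, 1.0, 2.0],
})

# Plain mean: every row counts once.
unweighted = demo["PV1MATH"].mean()

# Weighted mean: each row counts in proportion to its weight.
weighted = np.average(demo["PV1MATH"], weights=demo["W_FSTUWT"])

print(unweighted, weighted)
```

Here the third student counts twice, so the weighted mean (525.0) sits above the unweighted one (500.0).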


In [18]:
df.sample(5)

Unnamed: 0.1,Unnamed: 0,CNT,SUBNATIO,STRATUM,OECD,NC,SCHOOLID,STIDSTD,ST01Q01,ST02Q01,ST03Q01,ST03Q02,ST04Q01,ST05Q01,ST06Q01,ST07Q01,ST07Q02,ST07Q03,ST08Q01,ST09Q01,ST115Q01,ST11Q01,ST11Q02,ST11Q03,ST11Q04,ST11Q05,ST11Q06,ST13Q01,ST14Q01,ST14Q02,ST14Q03,ST14Q04,ST15Q01,ST17Q01,ST18Q01,ST18Q02,ST18Q03,ST18Q04,ST19Q01,ST20Q01,ST20Q02,ST20Q03,ST21Q01,ST25Q01,ST26Q01,ST26Q02,ST26Q03,ST26Q04,ST26Q05,ST26Q06,ST26Q07,ST26Q08,ST26Q09,ST26Q10,ST26Q11,ST26Q12,ST26Q13,ST26Q14,ST26Q15,ST26Q16,ST26Q17,ST27Q01,ST27Q02,ST27Q03,ST27Q04,ST27Q05,ST28Q01,ST29Q01,ST29Q02,ST29Q03,ST29Q04,ST29Q05,ST29Q06,ST29Q07,ST29Q08,ST35Q01,ST35Q02,ST35Q03,ST35Q04,ST35Q05,ST35Q06,ST37Q01,ST37Q02,ST37Q03,ST37Q04,ST37Q05,ST37Q06,ST37Q07,ST37Q08,ST42Q01,ST42Q02,ST42Q03,ST42Q04,ST42Q05,ST42Q06,ST42Q07,ST42Q08,ST42Q09,ST42Q10,ST43Q01,ST43Q02,ST43Q03,ST43Q04,ST43Q05,ST43Q06,ST44Q01,ST44Q03,ST44Q04,ST44Q05,ST44Q07,ST44Q08,ST46Q01,ST46Q02,ST46Q03,ST46Q04,ST46Q05,ST46Q06,ST46Q07,ST46Q08,ST46Q09,ST48Q01,ST48Q02,ST48Q03,ST48Q04,ST48Q05,ST49Q01,ST49Q02,ST49Q03,ST49Q04,ST49Q05,ST49Q06,ST49Q07,ST49Q09,ST53Q01,ST53Q02,ST53Q03,ST53Q04,ST55Q01,ST55Q02,ST55Q03,ST55Q04,ST57Q01,ST57Q02,ST57Q03,ST57Q04,ST57Q05,ST57Q06,ST61Q01,ST61Q02,ST61Q03,ST61Q04,ST61Q05,ST61Q06,ST61Q07,ST61Q08,ST61Q09,ST62Q01,ST62Q02,ST62Q03,ST62Q04,ST62Q06,ST62Q07,ST62Q08,ST62Q09,ST62Q10,ST62Q11,ST62Q12,ST62Q13,ST62Q15,ST62Q16,ST62Q17,ST62Q19,ST69Q01,ST69Q02,ST69Q03,ST70Q01,ST70Q02,ST70Q03,ST71Q01,ST72Q01,ST73Q01,ST73Q02,ST74Q01,ST74Q02,ST75Q01,ST75Q02,ST76Q01,ST76Q02,ST77Q01,ST77Q02,ST77Q04,ST77Q05,ST77Q06,ST79Q01,ST79Q02,ST79Q03,ST79Q04,ST79Q05,ST79Q06,ST79Q07,ST79Q08,ST79Q10,ST79Q11,ST79Q12,ST79Q15,ST79Q17,ST80Q01,ST80Q04,ST80Q05,ST80Q06,ST80Q07,ST80Q08,ST80Q09,ST80Q10,ST80Q11,ST81Q01,ST81Q02,ST81Q03,ST81Q04,ST81Q05,ST82Q01,ST82Q02,ST82Q03,ST83Q01,ST83Q02,ST83Q03,ST83Q04,ST84Q01,ST84Q02,ST84Q03,ST85Q01,ST85Q02,ST85Q03,ST85Q04,ST86Q01,ST86Q02,ST86Q03,ST86Q04,ST86Q05,ST87Q01,ST87Q02,ST87Q03,ST87Q04,ST87Q05,ST87Q06,ST87Q07,ST87Q08,ST87Q09,ST88Q01,S
T88Q02,ST88Q03,ST88Q04,ST89Q02,ST89Q03,ST89Q04,ST89Q05,ST91Q01,ST91Q02,ST91Q03,ST91Q04,ST91Q05,ST91Q06,ST93Q01,ST93Q03,ST93Q04,ST93Q06,ST93Q07,ST94Q05,ST94Q06,ST94Q09,ST94Q10,ST94Q14,ST96Q01,ST96Q02,ST96Q03,ST96Q05,ST101Q01,ST101Q02,ST101Q03,ST101Q05,ST104Q01,ST104Q04,ST104Q05,ST104Q06,IC01Q01,IC01Q02,IC01Q03,IC01Q04,IC01Q05,IC01Q06,IC01Q07,IC01Q08,IC01Q09,IC01Q10,IC01Q11,IC02Q01,IC02Q02,IC02Q03,IC02Q04,IC02Q05,IC02Q06,IC02Q07,IC03Q01,IC04Q01,IC05Q01,IC06Q01,IC07Q01,IC08Q01,IC08Q02,IC08Q03,IC08Q04,IC08Q05,IC08Q06,IC08Q07,IC08Q08,IC08Q09,IC08Q11,IC09Q01,IC09Q02,IC09Q03,IC09Q04,IC09Q05,IC09Q06,IC09Q07,IC10Q01,IC10Q02,IC10Q03,IC10Q04,IC10Q05,IC10Q06,IC10Q07,IC10Q08,IC10Q09,IC11Q01,IC11Q02,IC11Q03,IC11Q04,IC11Q05,IC11Q06,IC11Q07,IC22Q01,IC22Q02,IC22Q04,IC22Q06,IC22Q07,IC22Q08,EC01Q01,EC02Q01,EC03Q01,EC03Q02,EC03Q03,EC03Q04,EC03Q05,EC03Q06,EC03Q07,EC03Q08,EC03Q09,EC03Q10,EC04Q01A,EC04Q01B,EC04Q01C,EC04Q02A,EC04Q02B,EC04Q02C,EC04Q03A,EC04Q03B,EC04Q03C,EC04Q04A,EC04Q04B,EC04Q04C,EC04Q05A,EC04Q05B,EC04Q05C,EC04Q06A,EC04Q06B,EC04Q06C,EC05Q01,EC06Q01,EC07Q01,EC07Q02,EC07Q03,EC07Q04,EC07Q05,EC08Q01,EC08Q02,EC08Q03,EC08Q04,EC09Q03,EC10Q01,EC11Q02,EC11Q03,EC12Q01,ST22Q01,ST23Q01,ST23Q02,ST23Q03,ST23Q04,ST23Q05,ST23Q06,ST23Q07,ST23Q08,ST24Q01,ST24Q02,ST24Q03,CLCUSE1,CLCUSE301,CLCUSE302,DEFFORT,QUESTID,BOOKID,EASY,AGE,GRADE,PROGN,ANXMAT,ATSCHL,ATTLNACT,BELONG,BFMJ2,BMMJ1,CLSMAN,COBN_F,COBN_M,COBN_S,COGACT,CULTDIST,CULTPOS,DISCLIMA,ENTUSE,ESCS,EXAPPLM,EXPUREM,FAILMAT,FAMCON,FAMCONC,FAMSTRUC,FISCED,HEDRES,HERITCUL,HISCED,HISEI,HOMEPOS,HOMSCH,HOSTCUL,ICTATTNEG,ICTATTPOS,ICTHOME,ICTRES,ICTSCH,IMMIG,INFOCAR,INFOJOB1,INFOJOB2,INSTMOT,INTMAT,ISCEDD,ISCEDL,ISCEDO,LANGCOMM,LANGN,LANGRPPD,LMINS,MATBEH,MATHEFF,MATINTFC,MATWKETH,MISCED,MMINS,MTSUP,OCOD1,OCOD2,OPENPS,OUTHOURS,PARED,PERSEV,REPEAT,SCMAT,SMINS,STUDREL,SUBNORM,TCHBEHFA,TCHBEHSO,TCHBEHTD,TEACHSUP,TESTLANG,TIMEINT,USEMATH,USESCH,WEALTH,ANCATSCHL,ANCATTLNACT,ANCBELONG,ANCCLSMAN,ANCCOGACT,ANCINSTMOT,ANCINTMAT,ANCMATWKETH,ANCMTSUP,ANCS
CMAT,ANCSTUDREL,ANCSUBNORM,PV1MATH,PV2MATH,PV3MATH,PV4MATH,PV5MATH,PV1MACC,PV2MACC,PV3MACC,PV4MACC,PV5MACC,PV1MACQ,PV2MACQ,PV3MACQ,PV4MACQ,PV5MACQ,PV1MACS,PV2MACS,PV3MACS,PV4MACS,PV5MACS,PV1MACU,PV2MACU,PV3MACU,PV4MACU,PV5MACU,PV1MAPE,PV2MAPE,PV3MAPE,PV4MAPE,PV5MAPE,PV1MAPF,PV2MAPF,PV3MAPF,PV4MAPF,PV5MAPF,PV1MAPI,PV2MAPI,PV3MAPI,PV4MAPI,PV5MAPI,PV1READ,PV2READ,PV3READ,PV4READ,PV5READ,PV1SCIE,PV2SCIE,PV3SCIE,PV4SCIE,PV5SCIE,W_FSTUWT,W_FSTR1,W_FSTR2,W_FSTR3,W_FSTR4,W_FSTR5,W_FSTR6,W_FSTR7,W_FSTR8,W_FSTR9,W_FSTR10,W_FSTR11,W_FSTR12,W_FSTR13,W_FSTR14,W_FSTR15,W_FSTR16,W_FSTR17,W_FSTR18,W_FSTR19,W_FSTR20,W_FSTR21,W_FSTR22,W_FSTR23,W_FSTR24,W_FSTR25,W_FSTR26,W_FSTR27,W_FSTR28,W_FSTR29,W_FSTR30,W_FSTR31,W_FSTR32,W_FSTR33,W_FSTR34,W_FSTR35,W_FSTR36,W_FSTR37,W_FSTR38,W_FSTR39,W_FSTR40,W_FSTR41,W_FSTR42,W_FSTR43,W_FSTR44,W_FSTR45,W_FSTR46,W_FSTR47,W_FSTR48,W_FSTR49,W_FSTR50,W_FSTR51,W_FSTR52,W_FSTR53,W_FSTR54,W_FSTR55,W_FSTR56,W_FSTR57,W_FSTR58,W_FSTR59,W_FSTR60,W_FSTR61,W_FSTR62,W_FSTR63,W_FSTR64,W_FSTR65,W_FSTR66,W_FSTR67,W_FSTR68,W_FSTR69,W_FSTR70,W_FSTR71,W_FSTR72,W_FSTR73,W_FSTR74,W_FSTR75,W_FSTR76,W_FSTR77,W_FSTR78,W_FSTR79,W_FSTR80,WVARSTRR,VAR_UNIT,SENWGT_STU,VER_STU
200618,200619,United Kingdom,8260000,GBR1103,OECD,United Kingdom (excl.Scotland),462,11502,11,1.0,8,1997,Female,"Yes, for more than one year",5.0,"No, never","No, never","No, never",One or two times,,1.0,Yes,Yes,Yes,Yes,No,No,<ISCED level 3A>,No,Yes,No,No,Working part-time <for pay>,"<ISCED level 3B, 3C>",No,Yes,No,No,Working full-time <for pay>,Country of test,Country of test,Country of test,,Language of the test,Yes,Yes,Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Yes,Yes,Yes,826101,826101,826102,Three or more,Two,Three or more,Two,Two,201-500 books,,,,,,,,,,,,,,,,,,,,,,,Strongly disagree,Strongly disagree,Strongly disagree,Strongly agree,Disagree,Strongly agree,Strongly agree,Strongly disagree,Strongly agree,Strongly disagree,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,by heart,check memory,learning goals,Repeat examples,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,2.0,0.0,0.0,0.0,0.0,0.0,Frequently,Sometimes,Rarely,Sometimes,Frequently,Sometimes,Frequently,Sometimes,Frequently,Heard of it once or twice,Heard of it once or twice,Heard of it once or twice,Never heard of it,Heard of it often,"Know it well, understand the concept",Heard of it once or twice,Heard of it a few times,Never heard of it,Never heard of it,"Know it well, understand the concept",Never heard of it,Never heard of it,"Know it well, understand the concept",Heard of it a few times,"Know it well, understand the concept",55.0,55.0,55.0,3.0,3.0,7.0,25.0,26.0,Frequently,Sometimes,Frequently,Frequently,Frequently,Sometimes,Sometimes,Sometimes,Every Lesson,Never or Hardly Ever,Most Lessons,Most Lessons,Some Lessons,Some Lessons,Most Lessons,Never or Hardly Ever,Never or Hardly Ever,Most Lessons,Some Lessons,Some Lessons,Some Lessons,Never or Hardly Ever,Some Lessons,Every Lesson,Some Lessons,Some Lessons,Always or almost always,Always or almost 
always,Often,Often,Often,Often,Often,Sometimes,Sometimes,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Agree,Agree,Disagree,Strongly agree,Disagree,Agree,Agree,Disagree,Strongly agree,Strongly disagree,Strongly agree,Strongly agree,Strongly agree,Strongly disagree,Agree,Agree,Agree,Disagree,Disagree,Disagree,Strongly agree,Agree,Strongly disagree,Agree,Strongly disagree,Agree,Disagree,Agree,Disagree,Strongly disagree,Agree,Agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Agree,Strongly disagree,Agree,Strongly agree,Strongly disagree,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,97,97,97,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,A Scientific calculator,7,9,2,StQ Form C,booklet 3,Standard set of booklets,15.25,0.0,United Kingdom (excl.Scotland): Students study...,-1.68,0.24,1.2115,0.08,26.6,53.15,2.1989,United Kingdom,United Kingdom,United Kingdom,0.3891,,0.25,1.85,,0.71,0.5359,0.7955,,-0.2052,1.34,2.0,"ISCED 5A, 6",1.12,,"ISCED 5A, 6",53.15,0.86,,,,,,1.15,,Native,,,,,,C,ISCED level 3,General,,English,,165.0,,,,,"ISCED 5A, 6",165.0,-0.2395,Physiotherapy technicians and assistants,Structural-metal preparers and erectors,,2.0,16.0,,Did not repeat a 
<grade>,2.26,385.0,-0.79,,0.5054,-0.5809,-0.8083,-0.47,English,,,,0.51,0.2818,0.4908,0.257,0.9208,0.537,,,,0.1541,1.3157,0.0809,,583.1204,601.036,557.4155,555.8576,602.5939,586.2362,608.0464,617.3937,605.7096,610.3832,542.6157,558.1944,548.8472,551.9629,546.5104,581.5625,643.0986,665.6878,613.499,625.962,582.3415,602.5939,578.4468,572.2153,597.9202,594.8045,599.4781,611.9411,639.9829,584.6783,618.9516,615.0569,618.1726,643.0986,611.1622,579.2257,552.7419,634.5303,639.2039,586.2362,549.746,626.7941,552.9232,517.1793,561.6606,524.8415,618.0903,512.7191,547.2212,559.3435,95.6384,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,48.7021,48.7021,141.6501,48.7021,141.6501,48.7021,141.6501,141.6501,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,48.7021,48.7021,141.6501,48.7021,141.6501,48.7021,141.6501,141.6501,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,48.7021,48.7021,141.6501,48.7021,141.6501,48.7021,141.6501,141.6501,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,141.6501,141.6501,48.7021,48.7021,48.7021,48.7021,141.6501,48.7021,141.6501,48.7021,141.6501,141.6501,141.6501,141.6501,48.7021,48.7021,6,1,0.139,22NOV13
164241,164242,Spain,7241600,ESP1633,OECD,Spain,667,18659,10,1.0,7,1996,Male,"Yes, for more than one year",5.0,"No, never","No, never",,,,1.0,Yes,Yes,,Yes,,,<ISCED level 2>,No,No,No,Yes,,<ISCED level 3A>,No,No,Yes,Yes,Working full-time <for pay>,Country of test,Country of test,Country of test,,Language of the test,Yes,Yes,Yes,Yes,No,Yes,No,No,No,No,No,Yes,No,Yes,724001,724002,724001,Three or more,Three or more,Three or more,One,One,101-200 books,Disagree,Disagree,Disagree,Agree,Agree,Agree,Agree,Disagree,Disagree,Agree,Disagree,Agree,Disagree,Disagree,Not very confident,Very confident,Very confident,Very confident,Very confident,Very confident,Very confident,Not very confident,Disagree,Disagree,Disagree,Disagree,Disagree,Disagree,Disagree,Agree,Agree,Agree,Strongly agree,Strongly agree,Disagree,Disagree,Agree,Strongly disagree,Slightly likely,Likely,Slightly likely,Likely,Likely,Likely,Disagree,Disagree,Agree,Agree,Agree,Agree,Disagree,Agree,Disagree,Courses after school Math,Major in college Math,Study harder Math,Maximum classes Science,Pursuing a career Science,Never or rarely,Sometimes,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Most Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Most Lessons,Some Lessons,Some Lessons,Never or Hardly Ever,Some Lessons,Some Lessons,Most Lessons,Some Lessons,Often,Sometimes,Sometimes,Often,Sometimes,,Sometimes,Always or almost always,Often,Most Lessons,Most Lessons,Some Lessons,Most Lessons,Some Lessons,Agree,Disagree,Strongly disagree,Disagree,Disagree,Disagree,Agree,Agree,Disagree,Strongly disagree,Disagree,Disagree,Agree,Disagree,Agree,Disagree,Agree,Agree,Disagree,Disagree,Strongly agree,Agree,Strongly disagree,Agree,Strongly disagree,Agree,Disagree,Agree,Strongly disagree,Disagree,Agree,Agree,Agree,Disagree,Strongly agree,Strongly 
agree,Disagree,Agree,Disagree,Agree,Agree,Strongly disagree,Somewhat like me,Somewhat like me,Somewhat like me,Somewhat like me,Somewhat like me,Somewhat like me,Somewhat like me,Very much like me,Mostly like me,Mostly like me,probably not do this,probably do this,probably not do this,probably not do this,2.0,2.0,3.0,3.0,2.0,3.0,3.0,2.0,"Yes, and I use it",No,"Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, but I dont use it","Yes, and I use it","Yes, and I use it",No,"Yes, but I dont use it",No,No,"Yes, and I use it","Yes, but I dont use it",No,No,10-12 years old,7-9 years old,1,4,5,Once or twice a month,Once or twice a week,Once or twice a week,Once or twice a month,Every day,Every day,Every day,Almost every day,Almost every day,Once or twice a month,Once or twice a week,Almost every day,Almost every day,Once or twice a month,Once or twice a month,Once or twice a month,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,No,No,No,No,No,No,No,Disagree,Agree,Agree,Disagree,Disagree,Disagree,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,A Simple calculator,8,9,1,StQ Form B,booklet 13,Standard set of booklets,15.83,0.0,Spain: Compulsory Secondary Education,0.06,0.24,-0.3397,0.08,57.99,26.64,-0.7839,Basque Country (ESP),Basque Country (ESP),Basque Country (ESP),-0.0623,,-1.51,-0.71,0.4546,-0.08,,,0.1524,,,2.0,ISCED 5B,-1.29,,ISCED 5B,57.99,-0.5,0.5165,,-0.1489,-1.0568,0.416,0.24,-0.7538,Native,,,,-0.67,0.3,A,ISCED level 2,General,,Spanish,,,-1.0226,0.54,0.1775,-0.4017,"ISCED 3A, ISCED 4",,-1.1894,Personal care workers in health services not e...,Legal secretaries,0.2542,,13.0,-0.3407,Did not repeat a 
<grade>,-0.06,,-0.79,-0.7176,-0.2859,0.2217,-0.5612,-0.86,Spanish,39.0,-0.7749,-1.6104,-0.11,0.0526,-0.0831,0.0964,-0.237,0.1813,-0.1562,0.1826,-0.0549,-0.3671,0.0917,-0.2217,-0.0718,544.7967,535.4494,573.6174,564.2701,571.2806,565.0491,565.0491,610.2274,603.9959,568.1648,591.5329,632.0377,688.1212,660.0795,625.0273,600.1013,571.2806,670.2056,590.754,571.2806,530.7758,592.3119,597.7644,613.3432,568.9438,569.7227,585.3014,540.1231,585.3014,605.5538,619.5747,593.8698,574.3963,633.5956,662.4163,606.3328,618.7958,528.439,629.7009,629.7009,543.5401,529.9071,545.144,540.3323,549.9557,567.5494,518.1276,575.0093,551.6971,595.5241,2.7288,1.3644,4.0932,4.0932,4.0932,4.0932,1.3644,4.0932,1.3644,4.0932,1.3644,1.3644,1.3644,1.3644,4.0932,4.0932,1.3644,4.0932,4.0932,1.3644,1.3644,1.3644,4.0932,4.0932,4.0932,4.0932,1.3644,4.0932,1.3644,4.0932,1.3644,1.3644,1.3644,1.3644,4.0932,4.0932,1.3644,4.0932,4.0932,1.3644,1.3644,1.3644,4.0932,4.0932,4.0932,4.0932,1.3644,4.0932,1.3644,4.0932,1.3644,1.3644,1.3644,1.3644,4.0932,4.0932,1.3644,4.0932,4.0932,1.3644,1.3644,1.3644,4.0932,4.0932,4.0932,4.0932,1.3644,4.0932,1.3644,4.0932,1.3644,1.3644,1.3644,1.3644,4.0932,4.0932,1.3644,4.0932,4.0932,1.3644,1.3644,18,2,0.0073,22NOV13
309623,309624,Latvia,4280000,LVA0001,Non-OECD,Latvia,172,3562,9,1.0,6,1996,Female,"Yes, for more than one year",7.0,"No, never","No, never",,,,2.0,No,No,No,No,No,No,"<ISCED level 3B, 3C>",No,No,Yes,,Working part-time <for pay>,<ISCED level 3A>,No,No,No,Yes,Working full-time <for pay>,Country of test,Country of test,Country of test,,Language of the test,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,No,Yes,No,Yes,No,Yes,428001,428001,428001,Three or more,Two,Two,One,One,101-200 books,Strongly disagree,Strongly disagree,Disagree,Disagree,Strongly disagree,Strongly disagree,Disagree,Disagree,Disagree,Disagree,Disagree,Agree,Disagree,Agree,Not very confident,Not very confident,Not very confident,Confident,Confident,Not very confident,Not very confident,Not very confident,Agree,Strongly agree,Strongly agree,Disagree,Disagree,Strongly disagree,Strongly disagree,Agree,Strongly disagree,Agree,Agree,Agree,Strongly disagree,Agree,Agree,Strongly disagree,Likely,Likely,Likely,Likely,Likely,Slightly likely,Disagree,Agree,Disagree,Disagree,Disagree,Disagree,Agree,Disagree,Agree,Courses after school Test Language,Major in college Science,Study harder Test Language,Maximum classes Math,Pursuing a career Science,Sometimes,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Some Lessons,Every Lesson,Every Lesson,Most Lessons,Every Lesson,Some Lessons,Every Lesson,Every Lesson,Some Lessons,Some Lessons,Every Lesson,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Most Lessons,Most Lessons,Never or Hardly Ever,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Sometimes,Often,Often,Always or almost always,Never or Hardly Ever,Never or Hardly Ever,Never or Hardly Ever,Some Lessons,Never or Hardly Ever,Agree,Disagree,Strongly disagree,Strongly agree,Strongly agree,Strongly 
agree,Strongly agree,Disagree,Strongly agree,Strongly disagree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Agree,Strongly agree,Disagree,Strongly agree,Agree,Strongly disagree,Agree,Agree,Disagree,Agree,Disagree,Agree,Agree,Strongly agree,Strongly disagree,Strongly disagree,Agree,Agree,Strongly agree,Strongly agree,Agree,Agree,Agree,Agree,Strongly disagree,Disagree,Agree,Disagree,Not at all like me,Not at all like me,Mostly like me,Somewhat like me,Not much like me,Somewhat like me,Somewhat like me,Mostly like me,Mostly like me,Mostly like me,probably do this,definitely do this,probably not do this,probably do this,1.0,2.0,3.0,2.0,2.0,3.0,2.0,3.0,"Yes, and I use it","Yes, and I use it",No,No,No,"Yes, and I use it",,"Yes, and I use it","Yes, and I use it",No,No,"Yes, but I dont use it","Yes, but I dont use it",No,No,No,No,No,7-9 years old,7-9 years old,1,4,7,Never or hardly ever,Never or hardly ever,Never or hardly ever,Every day,Every day,Every day,Every day,Once or twice a month,Once or twice a month,Once or twice a month,Once or twice a week,Once or twice a month,Once or twice a week,Once or twice a week,Once or twice a week,Once or twice a week,Almost every day,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Once or twice a month,Once or twice a month,Once or twice a month,No,,No,"Yes, but only the teacher demonstrated this",No,"Yes, but only the teacher demonstrated this","Yes, but only the teacher demonstrated this",,,,,,,"No, never","No, never","No, never","No, never","No, never",Yes,"No, never",Yes,Yes,"No, never","No, never","No, never",2.0,1.0,2.0,2.0,1.0,2.0,2.0,1.0,2.0,2.0,1.0,2.0,2.0,1.0,2.0,2.0,1.0,2.0,<test language> or <other official national la...,,,,,,,,,,,"No, never",,"No, never","No, never",,,,,,,,,,,,,,,99,99,99,StQ Form B,booklet 7,Standard set of booklets,15.75,0.0,Latvia: Basic 
education,0.79,0.77,0.0873,0.08,72.94,55.25,0.3255,Latvia,Latvia,Latvia,1.2684,,0.25,1.19,-0.0919,0.45,,,0.3889,,,,"ISCED 3A, ISCED 4",0.04,,ISCED 5B,72.94,0.09,0.8118,,,,-0.8265,0.24,-1.4611,Native,-0.4138,,,-1.56,-0.95,A,ISCED level 2,General,,Russian,,,-1.0226,-1.06,-0.7332,-0.7235,ISCED 5B,,1.8433,Accounting associate professionals,Administrative and commercial managers,0.0521,,14.0,0.4795,Did not repeat a <grade>,-1.61,,0.45,-0.7176,-0.5945,0.4855,-0.0798,0.34,Russian,81.0,0.4931,-0.1388,-0.35,0.441,0.0309,0.257,0.2299,0.8023,-0.2795,0.0486,0.1817,0.827,-0.2172,0.3536,0.3141,435.6675,422.4256,396.7207,376.4683,404.51,399.8364,391.2681,408.4047,430.215,414.6362,423.2045,406.8469,444.2359,432.5518,431.7729,363.2264,366.3421,402.1732,414.6362,362.4474,383.4787,384.2577,407.6258,404.51,410.7415,428.6571,445.7937,473.0566,395.9417,478.5091,372.5736,400.6154,431.7729,378.8051,449.6884,423.2045,448.9095,480.8459,409.9626,474.6144,595.1011,539.4994,562.5344,518.8473,549.8254,485.304,425.6248,453.5994,446.1395,483.439,3.3385,5.0141,1.6714,5.0017,1.6672,5.0142,1.6715,5.0016,5.0016,5.0017,5.0016,1.6715,1.6672,5.0143,1.6672,1.6714,5.0142,5.0142,1.6714,1.6672,1.6672,5.014,1.6715,5.0015,1.6672,5.0144,1.6713,5.0016,5.0017,5.0017,5.0017,1.6715,1.6672,5.0145,1.6672,1.6715,5.014,5.014,1.6714,1.6672,1.6672,5.0141,1.6714,5.0017,1.6672,5.0142,1.6715,5.0016,5.0016,5.0017,5.0016,1.6715,1.6672,5.0143,1.6672,1.6714,5.0142,5.0142,1.6714,1.6672,1.6672,5.014,1.6715,5.0015,1.6672,5.0144,1.6713,5.0016,5.0017,5.0017,5.0017,1.6715,1.6672,5.0145,1.6672,1.6715,5.014,5.014,1.6714,1.6672,1.6672,79,1,0.2069,22NOV13
43952,43953,Belgium,560000,BEL0225,OECD,Belgium,93,2566,9,18.0,4,1996,Female,"Yes, for more than one year",6.0,"No, never","No, never","No, never",Five or more times,One or two times,2.0,Yes,Yes,,Yes,,,"<ISCED level 3B, 3C>",,,Yes,Yes,Working full-time <for pay>,"<ISCED level 3B, 3C>",,Yes,Yes,Yes,Working full-time <for pay>,Other country,Other country,Other country,15.0,Language of the test,Yes,Yes,Yes,Yes,No,Yes,No,No,No,No,No,Yes,Yes,Yes,56001,56001,56002,Three or more,Three or more,Two,Two,One,11-25 books,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Disagree,Strongly disagree,Strongly disagree,Agree,Strongly disagree,Strongly disagree,Not at all confident,Not at all confident,Not at all confident,Not at all confident,Not at all confident,Not at all confident,Not at all confident,Not at all confident,Agree,Agree,Agree,Agree,Agree,Strongly disagree,Strongly disagree,Agree,Strongly disagree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Likely,Not at all likely,Not at all likely,Likely,Likely,Likely,Strongly disagree,Strongly disagree,Agree,Strongly disagree,Agree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Courses after school Test Language,Major in college Science,Study harder Test Language,Maximum classes Science,Pursuing a career Science,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,Never or rarely,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Most Lessons,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Always or almost always,Always or almost 
always,Always or almost always,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Every Lesson,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Strongly disagree,Agree,Agree,Strongly agree,Agree,Agree,Agree,Agree,Disagree,Agree,Agree,Agree,,Disagree,Agree,Agree,Agree,Agree,Strongly disagree,Agree,Agree,Agree,Strongly disagree,Strongly disagree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Agree,Strongly disagree,Not at all like me,Somewhat like me,Not at all like me,Not at all like me,Not at all like me,Somewhat like me,Somewhat like me,Somewhat like me,Not at all like me,Not at all like me,definitely do this,definitely do this,definitely do this,probably do this,1.0,1.0,1.0,1.0,1.0,1.0,4.0,1.0,"Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it","Yes, and I use it",10-12 years old,10-12 years old,5,5,7,Once or twice a month,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,Never or hardly ever,"Yes, students did this","Yes, students did this","Yes, students did this","Yes, students did this","Yes, students did this","Yes, students did this","Yes, students did this",Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,"No, 
never","No, never","No, never","No, never","No, never",Yes,Yes,Yes,Yes,"No, never","No, never","No, never",2.0,2.0,1.0,1.0,2.0,2.0,2.0,2.0,1.0,2.0,2.0,1.0,1.0,2.0,2.0,2.0,2.0,1.0,<test language> or <other official national la...,,Mostly <heritage language>,,Mostly <heritage language>,Mostly <heritage language>,Mostly <heritage language>,Not applicable,Mostly <heritage language>,Mostly <heritage language>,Not applicable,"No, never",,"No, never","No, never",,No,,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,A Graphics calculator,8,10,2,StQ Form B,booklet 9,Standard set of booklets,16.0,-1.0,Belgium: second and third degrees of vocation...,0.79,0.77,-0.9394,-0.37,18.08,55.25,-0.4499,France,France,France,3.2019,1.5355,-1.51,-2.48,-2.8266,0.38,,,-0.3017,,,2.0,"ISCED 5A, 6",-1.29,1.305,"ISCED 5A, 6",55.25,-0.58,-2.4442,1.1591,2.4083,1.3045,2.7833,-0.4,2.8261,First-Generation,-0.0586,0.4355,-1.4871,-2.3,-1.78,C,ISCED level 3,Vocational,4.0,French,2.0,,-2.1402,-3.75,-1.5329,-1.9945,ISCED 5B,,-2.8645,Accounting associate professionals,Lifting truck operators,-1.3721,,17.0,-1.1279,Did not repeat a 
<grade>,-1.01,,-0.55,-2.3212,2.6295,3.3108,2.563,0.97,French,120.0,2.8011,-1.6104,0.35,-1.3226,-2.609,-2.6339,-2.8224,-0.5365,-2.2172,-1.6958,-2.743,-2.7492,-2.0134,-2.5726,-3.0277,380.2072,348.2708,337.3656,352.9444,399.6806,385.6598,322.5658,345.155,335.0288,252.4614,342.0393,305.4292,313.2186,284.3979,238.4406,345.9339,282.0611,317.1133,304.6503,218.9671,409.8068,356.8391,352.1654,331.1341,276.6085,316.3343,317.8922,390.3334,352.1654,369.3021,304.6503,352.1654,359.9548,338.9235,374.7546,295.303,292.9662,366.9653,318.6711,365.4074,449.1863,405.4993,449.1863,438.066,459.5123,459.4741,366.2253,423.107,409.1197,448.2842,14.4143,21.0745,21.5934,21.0745,21.0745,21.0745,7.2172,7.4569,21.5934,7.2172,7.4569,21.5934,21.0745,7.2172,7.4569,7.4569,7.4569,21.5934,7.2172,21.5934,7.2172,21.0745,21.5934,21.0745,21.0745,21.0745,7.2172,7.4569,21.5934,7.2172,7.4569,21.5934,21.0745,7.2172,7.4569,7.4569,7.4569,21.5934,7.2172,21.5934,7.2172,7.4569,7.2172,7.4569,7.4569,7.4569,21.5934,21.0745,7.2172,21.5934,21.0745,7.2172,7.4569,21.5934,21.0745,21.0745,21.0745,7.2172,21.5934,7.2172,21.5934,7.4569,7.2172,7.4569,7.4569,7.4569,21.5934,21.0745,7.2172,21.5934,21.0745,7.2172,7.4569,21.5934,21.0745,21.0745,21.0745,7.2172,21.5934,7.2172,21.5934,75,1,0.1223,22NOV13
298864,298865,Lithuania,4400000,LTU0011,Non-OECD,Lithuania,123,2679,9,4.0,7,1996,Female,No,7.0,"No, never","No, never",,,,1.0,Yes,Yes,Yes,No,No,No,<ISCED level 3A>,No,Yes,Yes,Yes,Working full-time <for pay>,<ISCED level 3A>,Yes,Yes,Yes,Yes,Working full-time <for pay>,Country of test,Country of test,Country of test,,Language of the test,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,440001,440001,440001,Three or more,Two,Three or more,Three or more,Three or more,201-500 books,,,,,,,,,,,,,,,,,,,,,,,Agree,Disagree,Agree,Disagree,Disagree,Disagree,Strongly disagree,Disagree,Strongly disagree,Strongly agree,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Most important,Improve understanding,learning goals,more information,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,I do not attend <out-of-school time lessons> i...,2.0,0.0,0.0,0.0,0.0,0.0,Rarely,Sometimes,Frequently,Sometimes,Sometimes,Rarely,Sometimes,Sometimes,Sometimes,Never heard of it,"Know it well, understand the concept",Heard of it often,Never heard of it,Heard of it a few times,Never heard of it,Heard of it once or twice,Never heard of it,"Know it well, understand the concept",Never heard of it,"Know it well, understand the concept",Never heard of it,Never heard of it,Never heard of it,"Know it well, understand the concept",Heard of it once or twice,45.0,45.0,45.0,4.0,4.0,7.0,,27.0,Frequently,Sometimes,Frequently,Frequently,Rarely,Rarely,Sometimes,Sometimes,Most Lessons,Some Lessons,Some Lessons,Some Lessons,Most Lessons,Most Lessons,Some Lessons,Some Lessons,Some Lessons,Some Lessons,Most Lessons,Some Lessons,Most Lessons,Some Lessons,Never or Hardly Ever,Some Lessons,Some Lessons,Some Lessons,Sometimes,Often,Sometimes,Often,Sometimes,Often,Often,Often,Sometimes,Never or Hardly Ever,Never or Hardly Ever,Some Lessons,Some Lessons,Never or Hardly Ever,Strongly agree,Agree,Strongly 
disagree,Agree,Agree,Agree,Agree,Disagree,Strongly agree,Strongly disagree,Strongly agree,Strongly agree,Agree,Strongly disagree,Agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly disagree,Disagree,Strongly agree,Disagree,Agree,Disagree,Agree,Disagree,Agree,Disagree,Strongly disagree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Agree,Strongly agree,Agree,Strongly disagree,Strongly disagree,Strongly agree,Strongly disagree,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,97,97,97,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,A Simple calculator,8,10,2,StQ Form C,booklet 5,Standard set of booklets,15.83,0.0,Lithuania: Lower Gymnasium,0.56,1.38,0.5206,-0.37,88.31,88.31,1.2923,Lithuania,Lithuania,Lithuania,-0.1777,,1.27,0.81,,2.06,0.322,-0.6635,,-0.4275,1.16,2.0,"ISCED 5A, 6",1.12,,"ISCED 5A, 6",88.31,2.65,,,,,,1.15,,Native,,,,,,A,ISCED level 2,General,,Lithuanian,,180.0,,,,,"ISCED 5A, 6",180.0,-0.2395,Dentists,Dentists,,2.0,16.0,,Did not repeat a 
<grade>,-0.76,315.0,1.51,,-0.5945,0.4855,-0.3204,-0.66,Lithuanian,,,,1.9,0.6517,0.1807,0.1459,0.669,0.3769,,,,0.1541,0.0917,0.7959,,523.298,498.372,505.3825,539.6557,544.3293,577.8236,565.3606,515.5086,557.5713,520.1823,535.761,534.2031,489.8037,529.5295,500.7088,563.8028,559.1291,537.3189,570.8132,509.2771,502.2667,475.0039,432.9413,466.4356,438.3938,538.0978,549.7819,490.5826,505.3825,527.9716,540.4346,538.8768,471.8881,519.4033,502.2667,545.8872,598.855,545.8872,518.6244,544.3293,565.7116,542.6766,538.7051,541.088,614.9588,562.4207,529.7837,518.5938,514.8639,539.1085,6.8792,3.3716,3.3716,3.3716,3.3716,10.5277,10.5277,3.3716,10.5277,10.5277,3.3716,3.3716,10.5277,10.5277,10.5277,10.5277,3.3716,10.5277,3.3716,10.5277,3.3716,3.3716,3.3716,3.3716,3.3716,10.5277,10.5277,3.3716,10.5277,10.5277,3.3716,3.3716,10.5277,10.5277,10.5277,10.5277,3.3716,10.5277,3.3716,10.5277,3.3716,10.5277,10.5277,10.5277,10.5277,3.3716,3.3716,10.5277,3.3716,3.3716,10.5277,10.5277,3.3716,3.3716,3.3716,3.3716,10.5277,3.3716,10.5277,3.3716,10.5277,10.5277,10.5277,10.5277,10.5277,3.3716,3.3716,10.5277,3.3716,3.3716,10.5277,10.5277,3.3716,3.3716,3.3716,3.3716,10.5277,3.3716,10.5277,3.3716,10.5277,48,2,0.2082,22NOV13


In [19]:
# Create one score column per subject by averaging its five plausible values (PV1..PV5)
df['math_score'] = (df['PV1MATH'] + df['PV2MATH'] + df['PV3MATH'] + df['PV4MATH'] + df['PV5MATH']) / 5
df['reading_score'] = (df['PV1READ'] + df['PV2READ'] + df['PV3READ'] + df['PV4READ'] + df['PV5READ']) / 5
df['science_score'] = (df['PV1SCIE'] + df['PV2SCIE'] + df['PV3SCIE'] + df['PV4SCIE'] + df['PV5SCIE']) / 5
# Create a column with the average score across the three subjects
df['ave_score'] = (df['math_score'] + df['science_score'] + df['reading_score']) / 3
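The same averaging can be written more compactly by looping over the subjects. The following is only a sketch on made-up numbers, not the notebook's own code: the `toy` frame and the `subjects` mapping are assumptions introduced for illustration, and `skipna=False` is chosen so that a missing plausible value yields `NaN`, as the explicit sum does.

```python
import pandas as pd

# Hypothetical mapping from the new column prefix to the PISA column suffix
subjects = {"math": "MATH", "reading": "READ", "science": "SCIE"}

# Toy data (made-up numbers): two students, five plausible values per subject
toy = pd.DataFrame({
    f"PV{i}{suffix}": [400.0 + 10 * i, 500.0 + 10 * i]
    for suffix in subjects.values()
    for i in range(1, 6)
})

for name, suffix in subjects.items():
    pv_cols = [f"PV{i}{suffix}" for i in range(1, 6)]
    # skipna=False: any missing plausible value makes the score NaN
    toy[f"{name}_score"] = toy[pv_cols].mean(axis=1, skipna=False)

# Average of the three subject scores
score_cols = [f"{name}_score" for name in subjects]
toy["ave_score"] = toy[score_cols].mean(axis=1, skipna=False)

print(toy[score_cols + ["ave_score"]])
```

Whether to propagate or skip missing plausible values is a design choice; dropping `skipna=False` would average whatever values are present instead.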

In [20]:
# Create a new dataset from the columns that we need.
# .copy() makes df_clean independent of df, so later in-place edits do not raise SettingWithCopyWarning.
df_clean = df[['CNT','ST04Q01', 'ST08Q01' , 'ST26Q02' ,'ST26Q07', 'ST29Q07', 'ST35Q04','ST26Q04','ST72Q01',
         'ST79Q03','ST79Q05','ST87Q03','ST93Q01','math_score','reading_score','science_score', 'ave_score' ]].copy()

In [21]:
df_clean

Unnamed: 0,CNT,ST04Q01,ST08Q01,ST26Q02,ST26Q07,ST29Q07,ST35Q04,ST26Q04,ST72Q01,ST79Q03,ST79Q05,ST87Q03,ST93Q01,math_score,reading_score,science_score,ave_score
0,Albania,Female,,No,No,Agree,Agree,No,,Never or Hardly Ever,Most Lessons,Strongly disagree,Very much like me,366.18634,261.01424,371.91348,333.038020
1,Albania,Female,One or two times,Yes,Yes,Disagree,Agree,Yes,30.0,,,,Not at all like me,470.56396,384.68832,478.12382,444.458700
2,Albania,Female,,Yes,Yes,Strongly agree,Strongly agree,Yes,30.0,,,,Not much like me,505.53824,405.18154,486.60946,465.776413
3,Albania,Female,,Yes,Yes,,,Yes,28.0,Every Lesson,Every Lesson,,,449.45476,477.46376,453.97240,460.296973
4,Albania,Female,One or two times,Yes,Yes,Strongly agree,Strongly agree,Yes,,Most Lessons,Most Lessons,Strongly agree,,385.50398,256.01010,367.15778,336.223953
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
485485,Vietnam,Female,One or two times,No,No,Agree,Strongly agree,Yes,41.0,,,,Not much like me,486.22058,472.61846,536.31110,498.383380
485486,Vietnam,Male,One or two times,Yes,Yes,Agree,Agree,Yes,,Never or Hardly Ever,Some Lessons,,Somewhat like me,529.21794,487.24356,524.37522,513.612240
485487,Vietnam,Male,,No,No,Agree,Agree,No,,Some Lessons,Some Lessons,Agree,Not much like me,486.29850,476.25694,541.90600,501.487147
485488,Vietnam,Male,,No,No,,,Yes,41.0,Never or Hardly Ever,Some Lessons,Disagree,,522.90856,518.43922,526.70646,522.684747


In [22]:
# Rename the PISA question codes to descriptive column names
df_clean = df_clean.rename(columns={'CNT' : 'Country', 'ST04Q01' : 'Gender', 'ST08Q01': 'late_for_sch', 'ST26Q02':'possessions_room',
                                    'ST26Q07':'possessions_literature','ST26Q04':'possessions_computer', 'ST29Q07':'math_imprtnt_future_student',
                                    'ST35Q04':'math_imprtnt_future_parents',  'ST72Q01':'class_size', 'ST79Q03':'teach_diff_task_to_diff_stud', 
                                    'ST79Q05':'teach_gives_feedback','ST87Q03':'stud_belong_at_sch', 'ST93Q01':'stud_give_up_easy'})

In [36]:
#Check to see if changes applied
df_clean

Unnamed: 0,Country,Gender,late_for_sch,possessions_room,possessions_literature,math_imprtnt_future_student,math_imprtnt_future_parents,possessions_computer,class_size,teach_diff_task_to_diff_stud,teach_gives_feedback,stud_belong_at_sch,stud_give_up_easy,math_score,reading_score,science_score,ave_score
0,Albania,Female,,No,No,Agree,Agree,No,26.017759,Never or Hardly Ever,Most Lessons,Strongly disagree,Very much like me,366.18634,261.01424,371.91348,333.038020
1,Albania,Female,One or two times,Yes,Yes,Disagree,Agree,Yes,30.000000,,,,Not at all like me,470.56396,384.68832,478.12382,444.458700
2,Albania,Female,,Yes,Yes,Strongly agree,Strongly agree,Yes,30.000000,,,,Not much like me,505.53824,405.18154,486.60946,465.776413
3,Albania,Female,,Yes,Yes,,,Yes,28.000000,Every Lesson,Every Lesson,,,449.45476,477.46376,453.97240,460.296973
4,Albania,Female,One or two times,Yes,Yes,Strongly agree,Strongly agree,Yes,26.017759,Most Lessons,Most Lessons,Strongly agree,,385.50398,256.01010,367.15778,336.223953
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
485485,Vietnam,Female,One or two times,No,No,Agree,Strongly agree,Yes,41.000000,,,,Not much like me,486.22058,472.61846,536.31110,498.383380
485486,Vietnam,Male,One or two times,Yes,Yes,Agree,Agree,Yes,26.017759,Never or Hardly Ever,Some Lessons,,Somewhat like me,529.21794,487.24356,524.37522,513.612240
485487,Vietnam,Male,,No,No,Agree,Agree,No,26.017759,Some Lessons,Some Lessons,Agree,Not much like me,486.29850,476.25694,541.90600,501.487147
485488,Vietnam,Male,,No,No,,,Yes,41.000000,Never or Hardly Ever,Some Lessons,Disagree,,522.90856,518.43922,526.70646,522.684747


In [24]:
df_clean.Country.unique()

array(['Albania', 'United Arab Emirates', 'Argentina', 'Australia',
       'Austria', 'Belgium', 'Bulgaria', 'Brazil', 'Canada',
       'Switzerland', 'Chile', 'Colombia', 'Costa Rica', 'Czech Republic',
       'Germany', 'Denmark', 'Spain', 'Estonia', 'Finland', 'France',
       'United Kingdom', 'Greece', 'Hong Kong-China', 'Croatia',
       'Hungary', 'Indonesia', 'Ireland', 'Iceland', 'Israel', 'Italy',
       'Jordan', 'Japan', 'Kazakhstan', 'Korea', 'Liechtenstein',
       'Lithuania', 'Luxembourg', 'Latvia', 'Macao-China', 'Mexico',
       'Montenegro', 'Malaysia', 'Netherlands', 'Norway', 'New Zealand',
       'Peru', 'Poland', 'Portugal', 'Qatar', 'China-Shanghai',
       'Perm(Russian Federation)', 'Florida (USA)', 'Connecticut (USA)',
       'Massachusetts (USA)', 'Romania', 'Russian Federation',
       'Singapore', 'Serbia', 'Slovak Republic', 'Slovenia', 'Sweden',
       'Chinese Taipei', 'Thailand', 'Tunisia', 'Turkey', 'Uruguay',
       'United States of America', 'Vietn

Some of the Country names are problematic. For example, Perm(Russian Federation) is listed separately from Russian Federation, and Florida, Connecticut and Massachusetts each appear as their own entry. These regions need to be merged into their parent countries, Russian Federation and United States of America.

In [25]:
# Merge sub-national regions into their parent country (Country column only)
df_clean['Country'] = df_clean['Country'].replace({
    'Perm(Russian Federation)': 'Russian Federation',
    'Florida (USA)': 'United States of America',
    'Massachusetts (USA)': 'United States of America',
    'Connecticut (USA)': 'United States of America'})

In [26]:
# Check that the changes were applied
df_clean.Country.unique()

array(['Albania', 'United Arab Emirates', 'Argentina', 'Australia',
       'Austria', 'Belgium', 'Bulgaria', 'Brazil', 'Canada',
       'Switzerland', 'Chile', 'Colombia', 'Costa Rica', 'Czech Republic',
       'Germany', 'Denmark', 'Spain', 'Estonia', 'Finland', 'France',
       'United Kingdom', 'Greece', 'Hong Kong-China', 'Croatia',
       'Hungary', 'Indonesia', 'Ireland', 'Iceland', 'Israel', 'Italy',
       'Jordan', 'Japan', 'Kazakhstan', 'Korea', 'Liechtenstein',
       'Lithuania', 'Luxembourg', 'Latvia', 'Macao-China', 'Mexico',
       'Montenegro', 'Malaysia', 'Netherlands', 'Norway', 'New Zealand',
       'Peru', 'Poland', 'Portugal', 'Qatar', 'China-Shanghai',
       'Russian Federation', 'United States of America', 'Romania',
       'Singapore', 'Serbia', 'Slovak Republic', 'Slovenia', 'Sweden',
       'Chinese Taipei', 'Thailand', 'Tunisia', 'Turkey', 'Uruguay',
       'Vietnam'], dtype=object)
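
Instead of reading through the array by eye, the same check can be done programmatically. The following is only a minimal sketch on a toy frame: `toy`, `old_labels` and `leftover` are illustrative names and the values are made up, not taken from the PISA data.

```python
import pandas as pd

# Toy stand-in for df_clean after the replacement step (illustrative values).
toy = pd.DataFrame({"Country": ["Russian Federation",
                                "United States of America",
                                "Albania"]})

# Labels that should no longer appear once the regions are merged.
old_labels = ["Perm(Russian Federation)", "Florida (USA)",
              "Massachusetts (USA)", "Connecticut (USA)"]

# Count the rows that still carry one of the old labels.
leftover = toy["Country"].isin(old_labels).sum()
assert leftover == 0, f"{leftover} rows still carry a sub-national label"
```

On the real data, `toy` would be replaced by `df_clean`.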

In [27]:
df_clean.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 485490 entries, 0 to 485489
Data columns (total 17 columns):
 #   Column                        Non-Null Count   Dtype  
---  ------                        --------------   -----  
 0   Country                       485490 non-null  object 
 1   Gender                        485490 non-null  object 
 2   late_for_sch                  479143 non-null  object 
 3   possessions_room              469693 non-null  object 
 4   possessions_literature        465860 non-null  object 
 5   math_imprtnt_future_student   315066 non-null  object 
 6   math_imprtnt_future_parents   315160 non-null  object 
 7   possessions_computer          473877 non-null  object 
 8   class_size                    294163 non-null  float64
 9   teach_diff_task_to_diff_stud  313955 non-null  object 
 10  teach_gives_feedback          313637 non-null  object 
 11  stud_belong_at_sch            310821 non-null  object 
 12  stud_give_up_easy             312856 non-nul

In [28]:
#Check some stats about this dataset
df_clean.describe()

Unnamed: 0,class_size,math_score,reading_score,science_score,ave_score
count,294163.0,485490.0,485490.0,485490.0,485490.0
mean,26.017759,469.651234,472.006964,475.808094,472.488764
std,9.223134,100.78661,98.86331,97.99847,96.036271
min,0.0,54.76708,6.4454,25.15854,77.114593
25%,20.0,396.01962,405.0442,405.7628,403.992595
50%,25.0,465.73452,475.47798,475.51286,472.04646
75%,30.0,540.12306,542.831195,546.38192,541.4557
max,200.0,903.10796,849.35974,857.8329,826.592027


Now we check for NA values and replace the missing `class_size` values with the column mean. 
<br>We could also drop the affected rows, but that would discard a large part of the data and make the analysis less accurate.

In [29]:
df_clean.isna().sum()

Country                              0
Gender                               0
late_for_sch                      6347
possessions_room                 15797
possessions_literature           19630
math_imprtnt_future_student     170424
math_imprtnt_future_parents     170330
possessions_computer             11613
class_size                      191327
teach_diff_task_to_diff_stud    171535
teach_gives_feedback            171853
stud_belong_at_sch              174669
stud_give_up_easy               172634
math_score                           0
reading_score                        0
science_score                        0
ave_score                            0
dtype: int64

In [30]:
# Fill the missing class sizes with the mean of the observed values
df_clean['class_size'] = df_clean['class_size'].fillna(df_clean['class_size'].mean())

In [31]:
df_clean.isna().sum()

Country                              0
Gender                               0
late_for_sch                      6347
possessions_room                 15797
possessions_literature           19630
math_imprtnt_future_student     170424
math_imprtnt_future_parents     170330
possessions_computer             11613
class_size                           0
teach_diff_task_to_diff_stud    171535
teach_gives_feedback            171853
stud_belong_at_sch              174669
stud_give_up_easy               172634
math_score                           0
reading_score                        0
science_score                        0
ave_score                            0
dtype: int64
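
Mean imputation keeps every row, but it has a side effect worth keeping in mind, since roughly 39% of the `class_size` values (191327 of 485490) were missing. The mean stays the same while the spread shrinks, because all filled values sit exactly at the mean. The sketch below shows this on made-up numbers; `sizes` and `filled` are illustrative names.

```python
import numpy as np
import pandas as pd

# Illustrative class sizes with gaps; not taken from the PISA data.
sizes = pd.Series([20.0, 25.0, np.nan, 30.0, np.nan, 45.0])

# Same technique as above: replace NaN with the mean of the observed values.
filled = sizes.fillna(sizes.mean())

print("mean before:", sizes.mean(), "after:", filled.mean())
print("std  before:", sizes.std(), "after:", filled.std())
```

The standard deviation after filling is smaller than before, so any later plot of `class_size` will show an artificial peak at the mean.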

In [34]:
# Convert the ordinal variables into ordered categorical types
ordinal_var_dict = {
    'late_for_sch': ['None  ', 'One or two times  ', 'Three or four times  ', 'Five or more times  '],
    'math_imprtnt_future_student': ['Strongly agree', 'Agree', 'Disagree', 'Strongly disagree'],
    'math_imprtnt_future_parents': ['Strongly agree', 'Agree', 'Disagree', 'Strongly disagree'],
    'teach_diff_task_to_diff_stud': ['Never or Hardly Ever', 'Some Lessons', 'Most Lessons', 'Every Lesson'],
    'teach_gives_feedback': ['Never or Hardly Ever', 'Some Lessons', 'Most Lessons', 'Every Lesson'],
    'stud_belong_at_sch': ['Strongly agree', 'Agree', 'Disagree', 'Strongly disagree'],
    'stud_give_up_easy': ['Very much like me', 'Mostly like me', 'Somewhat like me',
                          'Not much like me', 'Not at all like me']}

for var, levels in ordinal_var_dict.items():
    ordered_var = pd.api.types.CategoricalDtype(categories=levels, ordered=True)
    df_clean[var] = df_clean[var].astype(ordered_var)
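
To illustrate what the ordered conversion does, here is a small sketch on made-up answers. `levels`, `agree_type` and `answers` are illustrative names. It also shows a pitfall: a label that does not match one of the listed categories exactly (for instance because of a trailing space) is silently turned into NaN.

```python
import pandas as pd

levels = ["Strongly agree", "Agree", "Disagree", "Strongly disagree"]
agree_type = pd.api.types.CategoricalDtype(categories=levels, ordered=True)

# Illustrative answers; the last one has a trailing space on purpose.
answers = pd.Series(["Agree", "Strongly disagree", "Agree "]).astype(agree_type)

# Position of each answer in `levels`; -1 marks a missing value.
print(answers.cat.codes.tolist())
# The mismatched label became NaN.
print(answers.isna().tolist())
# Comparisons follow the declared category order.
print((answers < "Disagree").tolist())
```

Comparing `isna().sum()` before and after the conversion is a quick way to confirm that no labels were lost.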

The remaining nominal variables should also be converted to the `category` dtype.

In [57]:
# Convert the nominal variables to the category dtype
nominal_vars = ['Country', 'Gender', 'possessions_room',
                'possessions_literature', 'possessions_computer']
df_clean[nominal_vars] = df_clean[nominal_vars].astype('category')

The class size should be an integer value. Note that `astype('int')` truncates the decimal part, so the imputed mean of 26.017759 becomes 26.

In [58]:
df_clean['class_size'] = df_clean['class_size'].astype('int')

In [59]:
df_clean.dtypes

Country                         category
Gender                          category
late_for_sch                    category
possessions_room                category
possessions_literature          category
math_imprtnt_future_student     category
math_imprtnt_future_parents     category
possessions_computer            category
class_size                         int64
teach_diff_task_to_diff_stud    category
teach_gives_feedback            category
stud_belong_at_sch              category
stud_give_up_easy               category
math_score                       float64
reading_score                    float64
science_score                    float64
ave_score                        float64
dtype: object

In [60]:
df_clean.duplicated().sum()

0

We are ready to save the cleaned dataset in order to proceed with the visualisation and analysis of our data.

In [61]:
df_clean.to_csv('df_clean.csv', index = False, encoding = 'utf-8')

In [63]:
df_clean.shape

(485490, 17)
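
One caveat about the CSV export: a CSV file stores only text, so the `category` dtypes set during cleaning are not preserved, and the columns are inferred again when the file is read back. The sketch below demonstrates this on a toy frame with an in-memory buffer. `toy`, `gender_type`, `plain` and `typed` are illustrative names, and passing `dtype` to `read_csv` is just one possible way to restore the types.

```python
import io
import pandas as pd

gender_type = pd.api.types.CategoricalDtype(categories=["Female", "Male"])
toy = pd.DataFrame({"Gender": pd.Series(["Female", "Male"], dtype=gender_type),
                    "class_size": [26, 30]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

# Reading back without hints: the dtype is inferred from the text.
buffer.seek(0)
plain = pd.read_csv(buffer)

# Reading back with an explicit dtype restores the categorical column.
buffer.seek(0)
typed = pd.read_csv(buffer, dtype={"Gender": gender_type})

print(plain.dtypes)
print(typed.dtypes)
```

In the next notebook the conversion to categorical types would therefore have to be repeated, or the dtypes passed to `read_csv` as shown.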

## Which country has the most assessed students?
As we can see in the chart, Mexico is by far the country with the most assessed students. Italy is second with 31073 students and Spain third with 25313, followed by Canada and Brazil with 21544 and 19204 students respectively.
* We see the results in a bar chart, starting from the country with the most assessments.
* We can also see the results in a pie chart with the percentage of assessments per country.
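
The numbers behind both charts come from a frequency count of the Country column. A minimal sketch of that computation, assuming a toy series in place of `df_clean['Country']` (the names `countries`, `counts`, `shares` and `summary` and all values are illustrative):

```python
import pandas as pd

# Toy stand-in for df_clean["Country"]; the counts are illustrative only.
countries = pd.Series(["Mexico"] * 5 + ["Italy"] * 3 + ["Spain"] * 2,
                      name="Country")

# Students per country, sorted from most to fewest (the bar chart order).
counts = countries.value_counts()

# Share of each country in percent (the pie chart slices).
shares = countries.value_counts(normalize=True) * 100

summary = pd.DataFrame({"students": counts, "percent": shares.round(1)})
print(summary)
```

`value_counts()` already sorts in descending order, which is why its index can be passed directly as the `order` argument of the count plot.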

In [None]:
base_color = sns.color_palette()[4]

plt.style.use('fivethirtyeight')
plt.figure(figsize=(12,8))
sns.set(style='white')
sns.countplot(data=df_clean, x = 'Country',order=df_clean['Country'].value_counts().index);
sns.despine(bottom = True, left = True)
plt.xlabel('Countries')
plt.xticks(rotation=90, fontsize=12)
plt.ylabel('count')
plt.title('Number of assessed students per country', fontsize=20)
plt.show()