In [5]:
people = {
    'first' : ['Sunny', 'Raunak', 'Siddharth'],
    'last' : ['Tamang', 'Tamang', 'Tamang'],
    'email' : ['sunny@email.com', 'raunak@email.com', 'sid@email.com']
}

In [2]:
import pandas as pd

In [7]:
df = pd.DataFrame(people)

In [8]:
df

,first,last,email
0,Sunny,Tamang,sunny@email.com
1,Raunak,Tamang,raunak@email.com
2,Siddharth,Tamang,sid@email.com


### Setting a custom index on the email column

In [9]:
df.set_index('email')

email,first,last
sunny@email.com,Sunny,Tamang
raunak@email.com,Raunak,Tamang
sid@email.com,Siddharth,Tamang


In [10]:
df

,first,last,email
0,Sunny,Tamang,sunny@email.com
1,Raunak,Tamang,raunak@email.com
2,Siddharth,Tamang,sid@email.com


> Checking the original DataFrame shows that nothing changed: by default `set_index` returns a new DataFrame and leaves the original untouched.<br>
> To modify the original DataFrame, pass `inplace=True`:

In [11]:
df.set_index('email', inplace=True)
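
As an alternative to `inplace=True`, the DataFrame returned by `set_index` can be assigned to a name. The following is a minimal sketch using the same `people` data; the variable name `indexed` is only illustrative.

```python
import pandas as pd

people = {
    'first': ['Sunny', 'Raunak', 'Siddharth'],
    'last': ['Tamang', 'Tamang', 'Tamang'],
    'email': ['sunny@email.com', 'raunak@email.com', 'sid@email.com'],
}
df = pd.DataFrame(people)

# set_index returns a new DataFrame; df itself keeps its default index
indexed = df.set_index('email')

print(df.index.name)       # None
print(indexed.index.name)  # email
```

Assigning the result keeps the original `df` available, which can be handy when experimenting in a notebook.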

> Now the DataFrame shows `email` as its index:

In [12]:
df

email,first,last
sunny@email.com,Sunny,Tamang
raunak@email.com,Raunak,Tamang
sid@email.com,Siddharth,Tamang


### Check the index

In [13]:
df.index

Index(['sunny@email.com', 'raunak@email.com', 'sid@email.com'], dtype='object', name='email')

> Since the index is now `email`, we can pass an email address to `loc` to select rows by label:

In [14]:
df.loc['sunny@email.com']

first     Sunny
last     Tamang
Name: sunny@email.com, dtype: object

In [15]:
# select multiple rows by passing a list of email labels
df.loc[['sunny@email.com','sid@email.com']]

email,first,last
sunny@email.com,Sunny,Tamang
sid@email.com,Siddharth,Tamang
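
`loc` also accepts a column label after the row label, which narrows the selection to specific cells. A small sketch on the same data follows; the address `nobody@email.com` is a made-up label used only to show what happens when a label is missing.

```python
import pandas as pd

df = pd.DataFrame({
    'first': ['Sunny', 'Raunak', 'Siddharth'],
    'last': ['Tamang', 'Tamang', 'Tamang'],
    'email': ['sunny@email.com', 'raunak@email.com', 'sid@email.com'],
}).set_index('email')

# one cell: row label first, then column label
last_name = df.loc['sunny@email.com', 'last']
print(last_name)  # Tamang

# several rows, one column: returns a Series indexed by email
firsts = df.loc[['sunny@email.com', 'sid@email.com'], 'first']
print(firsts.tolist())  # ['Sunny', 'Siddharth']

# a label that is not in the index raises KeyError
try:
    df.loc['nobody@email.com']  # hypothetical, not in the data
    missing = False
except KeyError:
    missing = True
print(missing)  # True
```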


> To select rows by integer position, we can still use `iloc`:

In [16]:
df.iloc[0]

first     Sunny
last     Tamang
Name: sunny@email.com, dtype: object
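
The difference between the two accessors can be sketched as follows: `iloc` works on positions, `loc` works on labels, so with an email index an integer passed to `loc` is treated as a label that does not exist. The exact exception type for that case has varied across pandas versions, so the sketch catches both `KeyError` and `TypeError` as an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    'first': ['Sunny', 'Raunak', 'Siddharth'],
    'last': ['Tamang', 'Tamang', 'Tamang'],
    'email': ['sunny@email.com', 'raunak@email.com', 'sid@email.com'],
}).set_index('email')

# position 0 and the label 'sunny@email.com' refer to the same row
first_by_position = df.iloc[0]
first_by_label = df.loc['sunny@email.com']
print(first_by_position.equals(first_by_label))  # True

# 0 is not one of the email labels, so loc cannot find it
try:
    df.loc[0]
    int_label_found = True
except (KeyError, TypeError):
    int_label_found = False
print(int_label_found)  # False
```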

### Reset index

In [17]:
df.reset_index(inplace=True)

In [18]:
df

,email,first,last
0,sunny@email.com,Sunny,Tamang
1,raunak@email.com,Raunak,Tamang
2,sid@email.com,Siddharth,Tamang
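
By default `reset_index` moves the old index back into a regular column, as the output above shows. A short sketch of that behaviour, together with the `drop=True` option that discards the old index instead (the names `restored` and `dropped` are illustrative):

```python
import pandas as pd

indexed = pd.DataFrame({
    'first': ['Sunny', 'Raunak', 'Siddharth'],
    'last': ['Tamang', 'Tamang', 'Tamang'],
    'email': ['sunny@email.com', 'raunak@email.com', 'sid@email.com'],
}).set_index('email')

# default: email becomes a column again, index is 0..n-1
restored = indexed.reset_index()
print(list(restored.columns))  # ['email', 'first', 'last']

# drop=True: the email values are discarded rather than kept as a column
dropped = indexed.reset_index(drop=True)
print(list(dropped.columns))   # ['first', 'last']
```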


## Using the Stack Overflow survey data

> We can set the index during the loading of the dataset

In [22]:
df = pd.read_csv('survey_results_public.csv', index_col='ResponseId')
schema_df = pd.read_csv('survey_results_schema.csv')
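
To try `index_col` without the survey file, the same call can be made against a small in-memory CSV. This is only a sketch: the two rows below are a hand-written sample with three of the survey's columns, not the real dataset.

```python
import io
import pandas as pd

# tiny stand-in for survey_results_public.csv (illustrative sample only)
csv_text = """ResponseId,MainBranch,Country
1,I am a developer by profession,Slovakia
2,I am a student who is learning to code,Netherlands
"""

# index_col sets the index while the file is being read
sample = pd.read_csv(io.StringIO(csv_text), index_col='ResponseId')
print(sample.index.name)         # ResponseId
print(sample.loc[1, 'Country'])  # Slovakia

# loading first and calling set_index afterwards gives the same frame
same = pd.read_csv(io.StringIO(csv_text)).set_index('ResponseId')
print(sample.equals(same))       # True
```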

In [24]:
df.head()

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,...,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,...,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,...,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,...,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,...,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,...,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,


In [27]:
# Select the row whose ResponseId label is 1
df.loc[1]

MainBranch                                         I am a developer by profession
Employment                      Independent contractor, freelancer, or self-em...
Country                                                                  Slovakia
US_State                                                                      NaN
UK_Country                                                                    NaN
EdLevel                         Secondary school (e.g. American high school, G...
Age1stCode                                                          18 - 24 years
LearnCode                       Coding Bootcamp;Other online resources (ex: vi...
YearsCode                                                                     NaN
YearsCodePro                                                                  NaN
DevType                                                         Developer, mobile
OrgSize                                                        20 to 99 employees
Currency        

### Sort using the index

In [31]:
schema_df = pd.read_csv('survey_results_schema.csv', index_col='qname')

schema_df.sort_index()

qname,qid,question,force_resp,type,selector
Accessibility,QID124,"Which of the following describe you, if any? P...",False,MC,MAVR
Age,QID127,What is your age?,False,MC,MAVR
Age1stCode,QID149,At what age did you write your first line of c...,False,MC,MAVR
CompFreq,QID52,"Is that compensation weekly, monthly, or yearly?",False,MC,MAVR
CompTotal,QID51,What is your current total compensation (salar...,False,TE,SL
Country,QID6,"Where do you live? <span style=""font-weight: b...",True,MC,DL
Currency,QID50,Which currency do you use day-to-day? If your ...,True,MC,SB
Database,QID262,Which <b>database environments </b>have you do...,False,Matrix,Likert
DevType,QID31,Which of the following describes your current ...,False,MC,MAVR
EdLevel,QID25,Which of the following best describes the high...,False,MC,SAVR
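
Like `set_index`, `sort_index` returns a sorted copy by default and accepts `ascending=False` for reverse order. The sketch below uses a three-row stand-in for the schema (the `qname` and `qid` values are taken from the output above; the frame itself is illustrative).

```python
import pandas as pd

# small stand-in for schema_df, deliberately out of order
schema = pd.DataFrame(
    {'qid': ['QID6', 'QID127', 'QID31']},
    index=pd.Index(['Country', 'Age', 'DevType'], name='qname'),
)

asc = schema.sort_index()                  # alphabetical by qname
print(list(asc.index))                     # ['Age', 'Country', 'DevType']

desc = schema.sort_index(ascending=False)  # reverse alphabetical
print(list(desc.index))                    # ['DevType', 'Country', 'Age']

# the original is unchanged until inplace=True is used
print(list(schema.index))                  # ['Country', 'Age', 'DevType']
schema.sort_index(inplace=True)
print(list(schema.index))                  # ['Age', 'Country', 'DevType']
```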
