Topic: Pandas (Python) Tips and Tricks (Part 2)\
Written by: Abdul Rehman\
Date: 27.08.2025\
Email: oyeeemani47@gmail.com

[6- reverse row order](#6--reverse-row-order)\
[7- reverse column order](#7--reverse-column-order)\
[8- select a column by data type](#8--select-a-column-by-data-type)\
[9- convert a string to number](#9--convert-strings-to-number)\
[10- reduce data frame size](#10--reduce-data-frame-size)\
[11- copy from clipboard](#11--copy-from-clipboard)\
[12- split dataframe into two subsets](#12--split-dataframe-into-two-subsets)\
[13- join two sets](#13--join-two-sets)\
[14- filtering a data set](#14--filtering-a-data-set)\
[15- filtering by large category](#15--filtering-by-large-category)\
[16- splitting a string into multiple columns](#16--splitting-a-string-into-multiple-columns)\
[17- aggregate by multiple groups or columns or function](#17--aggregate-by-multiple-groups-or-function)\
[18- select specific rows and columns](#18--select-specific-rows-and-columns)\
[19- reshape multiindex series](#19--reshape-multiindex-series)\
[20- continuous data to categorical data conversion](#20--continuous-data-to-catagorical-data-conversion)\
[21- convert one set of values in another](#21--convert-one-set-of-values-in-another)\
[22- transpose a wide dataframe](#22--transpose-a-wide-dataframe)\
[23- reshaping a data frame](#23--reshaping-a-data-frame)\

# 6- Reverse row order

In [2]:
import seaborn as sns
import pandas as pd

df=sns.load_dataset("titanic")
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [3]:
df.loc[::-1].head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
890,0,3,male,32.0,0,0,7.75,Q,Third,man,True,,Queenstown,no,True
889,1,1,male,26.0,0,0,30.0,C,First,man,True,C,Cherbourg,yes,True
888,0,3,female,,1,2,23.45,S,Third,woman,False,,Southampton,no,False
887,1,1,female,19.0,0,0,30.0,S,First,woman,False,B,Southampton,yes,True
886,0,2,male,27.0,0,0,13.0,S,Second,man,True,,Southampton,no,True


In [4]:
df.loc[::-1].reset_index(drop=True).head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,32.0,0,0,7.75,Q,Third,man,True,,Queenstown,no,True
1,1,1,male,26.0,0,0,30.0,C,First,man,True,C,Cherbourg,yes,True
2,0,3,female,,1,2,23.45,S,Third,woman,False,,Southampton,no,False
3,1,1,female,19.0,0,0,30.0,S,First,woman,False,B,Southampton,yes,True
4,0,2,male,27.0,0,0,13.0,S,Second,man,True,,Southampton,no,True


# 7- Reverse column order

In [5]:
df.loc[:,::-1].head()

Unnamed: 0,alone,alive,embark_town,deck,adult_male,who,class,embarked,fare,parch,sibsp,age,sex,pclass,survived
0,False,no,Southampton,,True,man,Third,S,7.25,0,1,22.0,male,3,0
1,False,yes,Cherbourg,C,False,woman,First,C,71.2833,0,1,38.0,female,1,1
2,True,yes,Southampton,,False,woman,Third,S,7.925,0,0,26.0,female,3,1
3,False,yes,Southampton,C,False,woman,First,S,53.1,0,1,35.0,female,1,1
4,True,no,Southampton,,True,man,Third,S,8.05,0,0,35.0,male,3,0


# 8- Select a column by data type

In [6]:
df.dtypes

survived          int64
pclass            int64
sex              object
age             float64
sibsp             int64
parch             int64
fare            float64
embarked         object
class          category
who              object
adult_male         bool
deck           category
embark_town      object
alive            object
alone              bool
dtype: object

> ***Select only the columns that have a numeric type***

In [10]:
df.select_dtypes(include=["number"]).head()

Unnamed: 0,survived,pclass,age,sibsp,parch,fare
0,0,3,22.0,1,0,7.25
1,1,1,38.0,1,0,71.2833
2,1,3,26.0,0,0,7.925
3,1,1,35.0,1,0,53.1
4,0,3,35.0,0,0,8.05


In [None]:
# select only the columns that have the object type
df.select_dtypes(include=["object"]).head()

Unnamed: 0,sex,embarked,who,embark_town,alive
0,male,S,man,Southampton,no
1,female,C,woman,Cherbourg,yes
2,female,S,woman,Southampton,yes
3,female,S,woman,Southampton,yes
4,male,S,man,Southampton,no


In [16]:
# select columns of several types at once
df.select_dtypes(include=["object", "number"]).head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,who,embark_town,alive
0,0,3,male,22.0,1,0,7.25,S,man,Southampton,no
1,1,1,female,38.0,1,0,71.2833,C,woman,Cherbourg,yes
2,1,3,female,26.0,0,0,7.925,S,woman,Southampton,yes
3,1,1,female,35.0,1,0,53.1,S,woman,Southampton,yes
4,0,3,male,35.0,0,0,8.05,S,man,Southampton,no


`exclude` takes the same values as `include`, but drops the matching columns instead of keeping them.
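
As a minimal sketch of `exclude`, the small frame below (a made-up stand-in for the Titanic data, so it runs without a download) drops the numeric columns and keeps everything else. The name `mini` and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical miniature of the Titanic frame, used only for illustration
mini = pd.DataFrame({
    "survived": [0, 1, 1],
    "sex": ["male", "female", "female"],
    "age": [22.0, 38.0, 26.0],
    "alone": [False, False, True],
})

# exclude drops every column whose dtype matches; the rest are kept
non_numeric = mini.select_dtypes(exclude=["number"])
print(non_numeric.columns.tolist())
```

Note that `bool` columns do not count as `"number"`, so `alone` stays in the result.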

# 9- Convert strings to number

In [24]:
df=pd.DataFrame({'col_A': ['1','2','3','4','5','5','4','3','2','1'],
            'col_B':['6','7','8','9','6','7','8','9','5','5']})

df.dtypes

col_A    object
col_B    object
dtype: object

In [25]:
df.astype({'col_A': 'int64','col_B':'int64'}).dtypes

col_A    int64
col_B    int64
dtype: object
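
`astype` raises an error as soon as one value cannot be parsed. A possible alternative, sketched here on a made-up frame (`raw` and its `"-"` entry are assumptions, not part of the original data), is `pd.to_numeric` with `errors="coerce"`, which replaces unparseable strings with `NaN`.

```python
import pandas as pd

# Hypothetical columns of strings; col_B contains one entry that is not a number
raw = pd.DataFrame({"col_A": ["1", "2", "3"], "col_B": ["4", "-", "6"]})

# errors="coerce" turns unparseable strings into NaN instead of raising
clean = raw.apply(pd.to_numeric, errors="coerce")
print(clean.dtypes)
print(clean)
```

A column that picks up a `NaN` becomes a float column, because `NaN` cannot be stored in a plain integer column.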

# 10- Reduce data frame size

In [28]:
df = sns.load_dataset('titanic')
df.shape

(891, 15)

In [32]:
df.sample(frac=0.1).shape   # a random 10% sample; it is not stored, so df below is still the full frame
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 15 columns):
 #   Column       Non-Null Count  Dtype   
---  ------       --------------  -----   
 0   survived     891 non-null    int64   
 1   pclass       891 non-null    int64   
 2   sex          891 non-null    object  
 3   age          714 non-null    float64 
 4   sibsp        891 non-null    int64   
 5   parch        891 non-null    int64   
 6   fare         891 non-null    float64 
 7   embarked     889 non-null    object  
 8   class        891 non-null    category
 9   who          891 non-null    object  
 10  adult_male   891 non-null    bool    
 11  deck         203 non-null    category
 12  embark_town  889 non-null    object  
 13  alive        891 non-null    object  
 14  alone        891 non-null    bool    
dtypes: bool(2), category(2), float64(2), int64(4), object(5)
memory usage: 80.7+ KB
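
The two ideas in this section can be sketched side by side: keep fewer rows with `sample`, or keep all rows and store repeated strings more compactly. The frame `big` below is made up for illustration, and converting to `category` is an additional technique that the cells above do not use.

```python
import pandas as pd

# Hypothetical frame: 1,000 rows with a repetitive string column
big = pd.DataFrame({
    "town": ["Southampton", "Cherbourg", "Queenstown", "Southampton"] * 250,
    "fare": [7.25, 71.2833, 7.925, 53.1] * 250,
})

# Option 1: keep a random 10% of the rows (random_state makes it repeatable)
small = big.sample(frac=0.1, random_state=1)

# Option 2: keep all rows but store the repeated strings as a category
compact = big.astype({"town": "category"})

print(small.shape)
print(big.memory_usage(deep=True).sum(), compact.memory_usage(deep=True).sum())
```

`memory_usage(deep=True)` counts the actual string contents, which is why it is used here instead of the estimate printed by `df.info()`.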


# 11- Copy from clipboard

In [None]:
# download the dataset and save it as an Excel file
import seaborn as sns
import pandas as pd
df= sns.load_dataset("titanic") # load the data with the seaborn library
df.to_excel("kashti.xlsx")      # write it to an Excel file

In [7]:
# read the clipboard in Python
df = pd.read_clipboard()             # first copy the cells in Excel, then run this command
df.to_csv("excel_hua_wa_data.csv")   # to_csv writes plain text, so the file gets a .csv extension


# 12- Split dataframe into two subsets

In [14]:
import pandas as pd
import seaborn as sns
df=sns.load_dataset("titanic")
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [20]:
df.shape
len(df)

891

In [25]:
kashti_1 = df.sample(frac=0.50, random_state=1)   # a random half of the rows; random_state makes it repeatable
len(kashti_1)

446

In [27]:
kashti_2 = df.drop(kashti_1.index)
kashti_2.shape

(445, 15)

In [29]:
kashti_1.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
862,1,1,female,48.0,0,0,25.9292,S,First,woman,False,D,Southampton,yes,True
223,0,3,male,,0,0,7.8958,S,Third,man,True,,Southampton,no,True
84,1,2,female,17.0,0,0,10.5,S,Second,woman,False,,Southampton,yes,True
680,0,3,female,,0,0,8.1375,Q,Third,woman,False,,Queenstown,no,True
535,1,2,female,7.0,0,2,26.25,S,Second,child,False,,Southampton,yes,False


In [30]:
kashti_2.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
7,0,3,male,2.0,3,1,21.075,S,Third,child,False,,Southampton,no,False
10,1,3,female,4.0,1,1,16.7,S,Third,child,False,G,Southampton,yes,False
15,1,2,female,55.0,0,0,16.0,S,Second,woman,False,,Southampton,yes,True
18,0,3,female,31.0,1,0,18.0,S,Third,woman,False,,Southampton,no,False


In [31]:
len(kashti_1)+len(kashti_2)

891

# 13- Join two sets

In [41]:
df1 = pd.concat([kashti_1,kashti_2])
df1.shape

(891, 15)
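
`pd.concat` stacks the pieces in the order given, so the joined frame keeps the shuffled row labels of the two subsets. The sketch below, on a made-up six-row frame (`df_small`, `part_1` and `part_2` are illustrative names), shows one way to get the original order back.

```python
import pandas as pd

# Hypothetical small frame, split the same way as kashti_1 / kashti_2
df_small = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60]})
part_1 = df_small.sample(frac=0.5, random_state=1)
part_2 = df_small.drop(part_1.index)

# concat stacks the pieces; the index still carries the shuffled labels
joined = pd.concat([part_1, part_2])

# sort_index puts the rows back in their original order
restored = joined.sort_index()
print(joined.index.tolist())
print(restored.index.tolist())
```

If the original labels do not matter, `pd.concat([part_1, part_2], ignore_index=True)` renumbers the rows from 0 instead.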

# 14- Filtering a data set

In [42]:
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [44]:
df.sex.unique()

array(['male', 'female'], dtype=object)

In [45]:
df[(df.sex=="female")]

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.9250,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1000,S,First,woman,False,C,Southampton,yes,False
8,1,3,female,27.0,0,2,11.1333,S,Third,woman,False,,Southampton,yes,False
9,1,2,female,14.0,1,0,30.0708,C,Second,child,False,,Cherbourg,yes,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
880,1,2,female,25.0,0,1,26.0000,S,Second,woman,False,,Southampton,yes,False
882,0,3,female,22.0,0,0,10.5167,S,Third,woman,False,,Southampton,no,True
885,0,3,female,39.0,0,5,29.1250,Q,Third,woman,False,,Queenstown,no,False
887,1,1,female,19.0,0,0,30.0000,S,First,woman,False,B,Southampton,yes,True


In [46]:
df.age.unique()   # the parentheses are required; without them pandas only displays the bound method, not the values

In [49]:
df.embark_town.unique()
df[(df.embark_town=="Southampton")]

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.2500,S,Third,man,True,,Southampton,no,False
2,1,3,female,26.0,0,0,7.9250,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1000,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.0500,S,Third,man,True,,Southampton,no,True
6,0,1,male,54.0,0,0,51.8625,S,First,man,True,E,Southampton,no,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
883,0,2,male,28.0,0,0,10.5000,S,Second,man,True,,Southampton,no,True
884,0,3,male,25.0,0,0,7.0500,S,Third,man,True,,Southampton,no,True
886,0,2,male,27.0,0,0,13.0000,S,Second,man,True,,Southampton,no,True
887,1,1,female,19.0,0,0,30.0000,S,First,woman,False,B,Southampton,yes,True
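
The filters above use a single condition. As a minimal sketch on a made-up frame (`mini` and its values are assumptions), several conditions can be combined with `&` and `|`, and `isin` tests membership in a list of values.

```python
import pandas as pd

# Hypothetical miniature of the Titanic frame
mini = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female"],
    "age": [22.0, 38.0, 26.0, None, 14.0],
    "embark_town": ["Southampton", "Cherbourg", "Southampton",
                    "Queenstown", "Cherbourg"],
})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
adult_women = mini[(mini.sex == "female") & (mini.age >= 18)]

# isin() keeps the rows whose value appears in the list
two_towns = mini[mini.embark_town.isin(["Cherbourg", "Queenstown"])]

print(adult_women.index.tolist())
print(two_towns.index.tolist())
```

Rows with a missing age never satisfy `age >= 18`, so they drop out of the first filter.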


# 15- Filtering by large category
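
This section has no example in the notebook. One common approach, sketched here on made-up data (`mini` is an assumption), is to find the most frequent categories with `value_counts().nlargest()` and then filter with `isin`.

```python
import pandas as pd

# Hypothetical data: 'deck' has categories of very different sizes
mini = pd.DataFrame({"deck": list("CCCCBBBAADE"), "fare": range(11)})

# value_counts() counts each category; nlargest(2) keeps the two biggest
top_decks = mini.deck.value_counts().nlargest(2).index

# keep only the rows that belong to those categories
filtered = mini[mini.deck.isin(top_decks)]
print(sorted(filtered.deck.unique()))
print(len(filtered))
```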

# 16- Splitting a string into multiple columns

In [73]:
# import libraries
import pandas as pd

df = pd.DataFrame({"name":['ali','ahmad','subhan'],
                    "location":['india','pakistan','china']})
df

Unnamed: 0,name,location
0,ali,india
1,ahmad,pakistan
2,subhan,china
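
The frame above has nothing to split yet, so the sketch below uses made-up columns (`full_name` and `place` are assumptions for illustration) to show how `str.split` with `expand=True` could turn one string column into several.

```python
import pandas as pd

# Hypothetical frame: 'full_name' and 'place' are made-up columns
people = pd.DataFrame({
    "full_name": ["Ali Khan", "Ahmad Raza", "Subhan Malik"],
    "place": ["Delhi, India", "Lahore, Pakistan", "Beijing, China"],
})

# expand=True returns one column per piece instead of a list per row
people[["first", "last"]] = people.full_name.str.split(" ", expand=True)
people[["city", "country"]] = people.place.str.split(", ", expand=True)

print(people[["first", "last", "city", "country"]])
```

To keep only one of the pieces, index the expanded result, for example `people.place.str.split(", ", expand=True)[0]` for the city.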


# 17- Aggregate by multiple groups or function

In [76]:
import pandas as pd
import seaborn as sns
df=sns.load_dataset("titanic")
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [77]:
df.groupby('who').count()

who,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,adult_male,deck,embark_town,alive,alone
child,83,83,83,83,83,83,83,83,83,83,13,83,83,83
man,537,537,537,413,537,537,537,537,537,537,99,537,537,537
woman,271,271,271,218,271,271,271,269,271,271,91,269,271,271


In [79]:
df.groupby('sex').count()

sex,survived,pclass,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
female,314,314,261,314,314,314,312,314,314,314,97,312,314,314
male,577,577,453,577,577,577,577,577,577,577,106,577,577,577


In [84]:
len(df.groupby("who"))

3

In [87]:
df.groupby(['sex','pclass','who']).count()

sex,pclass,who,survived,age,sibsp,parch,fare,embarked,class,adult_male,deck,embark_town,alive,alone
female,1,child,3,3,3,3,3,3,3,3,3,3,3,3
female,1,woman,91,82,91,91,91,89,91,91,78,89,91,91
female,2,child,10,10,10,10,10,10,10,10,1,10,10,10
female,2,woman,66,64,66,66,66,66,66,66,9,66,66,66
female,3,child,30,30,30,30,30,30,30,30,2,30,30,30
female,3,woman,114,72,114,114,114,114,114,114,4,114,114,114
male,1,child,3,3,3,3,3,3,3,3,3,3,3,3
male,1,man,119,98,119,119,119,119,119,119,91,119,119,119
male,2,child,9,9,9,9,9,9,9,9,3,9,9,9
male,2,man,99,90,99,99,99,99,99,99,3,99,99,99
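
The cells above only count rows. To aggregate with several functions at once, `agg` accepts a list of function names. This is a minimal sketch on a made-up frame (`mini` and its values are assumptions).

```python
import pandas as pd

# Hypothetical miniature of the Titanic frame
mini = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female", "male"],
    "pclass": [3, 1, 3, 1, 1, 3],
    "fare": [7.25, 71.0, 8.0, 53.0, 29.0, 8.75],
})

# agg() with a list applies every function to each group
by_sex = mini.groupby("sex").fare.agg(["count", "mean", "max"])

# grouping by two columns gives one row per (sex, pclass) combination
by_sex_class = mini.groupby(["sex", "pclass"]).fare.agg(["count", "mean"])

print(by_sex)
print(by_sex_class)
```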


# 18- Select specific rows and columns

In [88]:
df

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.2500,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.9250,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1000,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.0500,S,Third,man,True,,Southampton,no,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,0,2,male,27.0,0,0,13.0000,S,Second,man,True,,Southampton,no,True
887,1,1,female,19.0,0,0,30.0000,S,First,woman,False,B,Southampton,yes,True
888,0,3,female,,1,2,23.4500,S,Third,woman,False,,Southampton,no,False
889,1,1,male,26.0,0,0,30.0000,C,First,man,True,C,Cherbourg,yes,True


In [89]:
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [None]:
# select column
df[['sex','class']]

Unnamed: 0,sex,class
0,male,Third
1,female,First
2,female,Third
3,female,First
4,male,Third
...,...,...
886,male,Second
887,female,First
888,female,Third
889,male,First


In [97]:
# select rows: start from the summary table, whose row labels run from count to max
df.describe()

Unnamed: 0,survived,pclass,age,sibsp,parch,fare
count,891.0,891.0,714.0,891.0,891.0,891.0
mean,0.383838,2.308642,29.699118,0.523008,0.381594,32.204208
std,0.486592,0.836071,14.526497,1.102743,0.806057,49.693429
min,0.0,1.0,0.42,0.0,0.0,0.0
25%,0.0,2.0,20.125,0.0,0.0,7.9104
50%,0.0,3.0,28.0,0.0,0.0,14.4542
75%,1.0,3.0,38.0,1.0,0.0,31.0
max,1.0,3.0,80.0,8.0,6.0,512.3292


In [99]:
df.describe().loc['min':'max']

Unnamed: 0,survived,pclass,age,sibsp,parch,fare
min,0.0,1.0,0.42,0.0,0.0,0.0
25%,0.0,2.0,20.125,0.0,0.0,7.9104
50%,0.0,3.0,28.0,0.0,0.0,14.4542
75%,1.0,3.0,38.0,1.0,0.0,31.0
max,1.0,3.0,80.0,8.0,6.0,512.3292
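
`loc` can select rows and columns in one step by passing both, separated by a comma. The sketch below uses a made-up numeric frame (`mini` is an assumption) in place of the Titanic data.

```python
import pandas as pd

# Hypothetical numeric frame standing in for the Titanic columns
mini = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0],
    "sibsp": [1, 1, 0, 1],
    "fare": [7.25, 71.28, 7.93, 53.10],
})

# loc takes [rows, columns]; label slices include both end points
summary = mini.describe().loc["min":"max", "age":"fare"]

# a list of labels picks specific rows and columns instead of a range
picked = mini.describe().loc[["mean", "max"], ["age", "fare"]]

print(summary)
print(picked)
```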


# 19- Reshape multiindex series

In [5]:
import pandas as pd
import seaborn as sns
df=sns.load_dataset('titanic')
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [6]:
df.survived.mean()

np.float64(0.3838383838383838)

In [8]:
df.groupby('sex').survived.mean()

sex
female    0.742038
male      0.188908
Name: survived, dtype: float64

In [11]:
df.groupby(['sex','class']).survived.mean()

sex     class 
female  First     0.968085
        Second    0.921053
        Third     0.500000
male    First     0.368852
        Second    0.157407
        Third     0.135447
Name: survived, dtype: float64
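
The result above is a Series with a two-level index. To reshape it into a table, `unstack` can move the inner level into the columns. This is a minimal sketch on a made-up frame (`mini` and its values are assumptions).

```python
import pandas as pd

# Hypothetical miniature of the Titanic frame
mini = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "female", "male"],
    "class": ["First", "Third", "First", "Third", "Third", "Third"],
    "survived": [1, 1, 0, 0, 0, 1],
})

# groupby on two columns returns a Series with a two-level (MultiIndex) index
rates = mini.groupby(["sex", "class"]).survived.mean()

# unstack() moves the inner level ('class') into the columns
table = rates.unstack()
print(table)
```

The rows of `table` are the `sex` values and the columns are the `class` values, which is usually easier to read than the stacked Series.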

# 20- Continuous data to catagorical data conversion

In [12]:
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True


In [13]:
df.age.head()

0    22.0
1    38.0
2    26.0
3    35.0
4    35.0
Name: age, dtype: float64

In [15]:
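# creating bins: (0, 18] -> child, (18, 25] -> young_adult, (25, 99] -> adult
# pd.cut returns a new Series; it is not assigned here, so df itself is unchanged
pd.cut(df.age, bins = [0,18,25,99], labels = ['child','young_adult','adult']).head()
df.head()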

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True
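
Since the cell above does not display or store the binned values, here is a minimal sketch of what `pd.cut` produces, using a made-up Series of ages (`ages` and the column name `age_group` are assumptions).

```python
import pandas as pd

# Hypothetical ages, including a missing value
ages = pd.Series([4.0, 18.0, 22.0, 38.0, None])

# bins are right-inclusive by default: (0, 18], (18, 25], (25, 99]
groups = pd.cut(ages, bins=[0, 18, 25, 99],
                labels=["child", "young_adult", "adult"])

# assign the result to a column to keep it; missing ages stay missing
frame = pd.DataFrame({"age": ages, "age_group": groups})
print(frame)
```

An age of exactly 18 falls into `child` because the right edge of each bin is included.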


# 21- Convert one set of values in another

In [17]:
df.sex.head()

0      male
1    female
2    female
3    female
4      male
Name: sex, dtype: object

In [18]:
df.sex.map({'male':'0','female':'1'})

0      0
1      1
2      1
3      1
4      0
      ..
886    0
887    1
888    1
889    0
890    0
Name: sex, Length: 891, dtype: object

In [21]:
df['sex_num']=df.sex.map({'male':0,'female':1})
df.head()

Unnamed: 0,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone,sex_num
0,0,3,male,22.0,1,0,7.25,S,Third,man,True,,Southampton,no,False,0
1,1,1,female,38.0,1,0,71.2833,C,First,woman,False,C,Cherbourg,yes,False,1
2,1,3,female,26.0,0,0,7.925,S,Third,woman,False,,Southampton,yes,True,1
3,1,1,female,35.0,1,0,53.1,S,First,woman,False,C,Southampton,yes,False,1
4,0,3,male,35.0,0,0,8.05,S,Third,man,True,,Southampton,no,True,0


In [25]:
df.embarked.factorize()[0]

array([ 0,  1,  0,  0,  0,  2,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  2,
        0,  0,  1,  0,  0,  2,  0,  0,  0,  1,  0,  2,  0,  1,  1,  2,  0,
        1,  0,  1,  0,  0,  1,  0,  0,  1,  1,  2,  0,  2,  2,  1,  0,  0,
        0,  1,  0,  1,  0,  0,  1,  0,  0,  1, -1,  0,  0,  1,  1,  0,  0,
        0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,  0,  2,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  1,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  2,  0,  1,  0,  0,  1,  0,  2,  0,  1,
        0,  0,  0,  1,  0,  0,  1,  2,  0,  1,  0,  1,  0,  0,  0,  0,  1,
        0,  0,  0,  1,  1,  0,  0,  2,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  1,  2,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  2,  0,  0,  1,  0,  0,  1,  0,  0,  0,  1,  0,  0,  0,  0,  2,
        0,  2,  0,  0,  0,  0,  0,  1,  1,  2,  0,  2,  0,  0,  0,  0,  1,
        0,  0,  0,  1,  2,  1,  0,  0,  0,  0,  2,  1,  0,  0,  1,  0,  0,
        0,  0,  0,  0,  0
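
`factorize` returns two things: the integer codes and the unique values they stand for. The sketch below uses a made-up Series (`ports` is an assumption) to show both, including how a missing value is coded.

```python
import pandas as pd

# Hypothetical port codes with one missing value
ports = pd.Series(["S", "C", "S", "Q", None, "C"])

# factorize() returns (codes, uniques); codes follow the order of first appearance
codes, uniques = ports.factorize()

print(codes)          # missing values are coded as -1
print(list(uniques))
```

Unlike `map`, `factorize` needs no dictionary, but the numbers depend on the order in which the values first appear.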

# 22- Transpose a wide dataframe

In [26]:
import numpy as np
import pandas as pd

In [31]:
# creating a new data frame
df=pd.DataFrame(np.random.rand(200,25), columns  = list('abcdefghijklmnopqrstuvwxy'))
df.head()

Unnamed: 0,a,b,c,d,e,f,g,h,i,j,...,p,q,r,s,t,u,v,w,x,y
0,0.937062,0.789528,0.170686,0.861162,0.172689,0.841249,0.797055,0.11256,0.069078,0.523638,...,0.942641,0.440491,0.929451,0.349218,0.179783,0.788705,0.085859,0.077687,0.660168,0.68112
1,0.488859,0.4881,0.22989,0.762488,0.459775,0.289417,0.666233,0.273688,0.615747,0.102813,...,0.266846,0.055911,0.285034,0.014778,0.658612,0.297928,0.871007,0.643628,0.577464,0.878648
2,0.601853,0.264928,0.93924,0.347919,0.160011,0.218702,0.350459,0.580445,0.171754,0.199986,...,0.014727,0.743637,0.7608,0.629501,0.349409,0.58388,0.263022,0.612148,0.770023,0.485541
3,0.420157,0.254955,0.463687,0.863203,0.903504,0.551129,0.304376,0.966581,0.087405,0.378386,...,0.722563,0.576408,0.093862,0.785023,0.261158,0.357358,0.791729,0.630598,0.703117,0.109667
4,0.389676,0.972613,0.725844,0.649048,0.574585,0.865675,0.240118,0.104128,0.47607,0.313249,...,0.932804,0.16651,0.59457,0.02124,0.934643,0.637963,0.648643,0.311694,0.814577,0.489999


In [32]:
df.head(10).T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
a,0.937062,0.488859,0.601853,0.420157,0.389676,0.284119,0.155402,0.297942,0.542679,0.364797
b,0.789528,0.4881,0.264928,0.254955,0.972613,0.406206,0.915329,0.559056,0.592816,0.143861
c,0.170686,0.22989,0.93924,0.463687,0.725844,0.32501,0.814666,0.831234,0.435576,0.208305
d,0.861162,0.762488,0.347919,0.863203,0.649048,0.217717,0.701279,0.646514,0.1431,0.891275
e,0.172689,0.459775,0.160011,0.903504,0.574585,0.787221,0.675689,0.865397,0.727568,0.706458
f,0.841249,0.289417,0.218702,0.551129,0.865675,0.058034,0.449727,0.253291,0.787456,0.141682
g,0.797055,0.666233,0.350459,0.304376,0.240118,0.835809,0.322124,0.279903,0.141864,0.869423
h,0.11256,0.273688,0.580445,0.966581,0.104128,0.216601,0.050178,0.776834,0.246897,0.214129
i,0.069078,0.615747,0.171754,0.087405,0.47607,0.80264,0.047574,0.172178,0.626861,0.413207
j,0.523638,0.102813,0.199986,0.378386,0.313249,0.531253,0.059953,0.590409,0.426147,0.825555


In [34]:
df.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
a,200.0,0.497869,0.286999,0.004861,0.227389,0.516634,0.743267,0.995356
b,200.0,0.499871,0.271408,0.013483,0.288954,0.50938,0.755092,0.985312
c,200.0,0.498896,0.272799,0.003776,0.289544,0.469595,0.722852,0.992269
d,200.0,0.519619,0.286153,0.011257,0.291801,0.524707,0.793005,0.989762
e,200.0,0.484516,0.282723,0.002883,0.233792,0.535341,0.696267,0.989242
f,200.0,0.524687,0.289704,0.000794,0.263146,0.544772,0.781546,0.995723
g,200.0,0.484997,0.286245,0.006415,0.235643,0.460367,0.725968,0.999814
h,200.0,0.464292,0.293219,0.007104,0.216258,0.444524,0.709624,0.996562
i,200.0,0.491285,0.27983,0.002144,0.287349,0.48843,0.703643,0.994559
j,200.0,0.458188,0.298367,0.000324,0.196334,0.427781,0.6916,0.999763


# 23- Reshaping a data frame
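
This section has no example in the notebook. One common reshaping operation is going from a wide frame to a long one with `melt`, sketched here on made-up data (`wide`, `day_1`, `day_2` and `temp` are assumptions for illustration).

```python
import pandas as pd

# Hypothetical wide frame: one row per city, one column per day
wide = pd.DataFrame({
    "city": ["Lahore", "Karachi"],
    "day_1": [31, 33],
    "day_2": [29, 34],
})

# melt() turns the day columns into rows: one row per (city, day) pair
long_df = wide.melt(id_vars="city", var_name="day", value_name="temp")

# pivot() reverses the operation and brings back one column per day
back = long_df.pivot(index="city", columns="day", values="temp")

print(long_df)
print(back)
```

The long form is usually the more convenient one for grouping and plotting, while the wide form is easier to read as a table.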