# LEARN ABOUT PANDAS

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

### Library Highlights

A fast and efficient DataFrame object for data manipulation with integrated indexing;

Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format;

Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form;

Flexible reshaping and pivoting of data sets;

Intelligent label-based slicing, fancy indexing, and subsetting of large data sets;

Columns can be inserted and deleted from data structures for size mutability;

Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets;

High performance merging and joining of data sets;

Hierarchical axis indexing provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure;

Time series-functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data;

Highly optimized for performance, with critical code paths written in Cython or C.

Python with pandas is in use in a wide variety of academic and commercial domains, including Finance, Neuroscience, Economics, Statistics, Advertising, Web Analytics, and more.

### Agenda of Notebook

- What is a DataFrame?
- What is a Series?
- Different operations in pandas

In [1]:
# First step is to import pandas
import pandas as pd
import numpy as np

In [2]:
# Playing with Dataframe
df=pd.DataFrame(np.arange(0,20).reshape(5,4),index=['Row1','Row2','Row3','Row4','Row5'],columns=["Column1","Column2","Column3","Coumn4"])

In [3]:
# head(n) shows the first n rows; pass the number you want in the brackets
# by default it shows the first five rows
df.head()

Unnamed: 0,Column1,Column2,Column3,Coumn4
Row1,0,1,2,3
Row2,4,5,6,7
Row3,8,9,10,11
Row4,12,13,14,15
Row5,16,17,18,19
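As a small illustration of the row-count argument, the sketch below builds a hypothetical ten-row frame (not the one used above) and compares `head()` and `head(3)` with the matching `tail(2)`.

```python
import numpy as np
import pandas as pd

# Hypothetical 10 x 4 frame, used only for this illustration
demo = pd.DataFrame(np.arange(40).reshape(10, 4))

first_default = demo.head()    # no argument: first five rows
first_three = demo.head(3)     # explicit row count
last_two = demo.tail(2)        # tail works the same way from the end

print(len(first_default), len(first_three))
print(last_two.index.tolist())
```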


In [4]:
# Accessing elements: loc selects a row by its label and returns that row's values
df.loc['Row1']

Column1    0
Column2    1
Column3    2
Coumn4     3
Name: Row1, dtype: int32

In [5]:
# Check the type
type(df.loc['Row1'])

pandas.core.series.Series
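A single row or a single column taken from a DataFrame is a one-dimensional `Series`. The following is a minimal sketch, using a small frame with hypothetical labels so that it runs without the cells above.

```python
import numpy as np
import pandas as pd

# Small frame with illustrative (hypothetical) labels
demo = pd.DataFrame(np.arange(6).reshape(2, 3),
                    index=['r1', 'r2'], columns=['a', 'b', 'c'])

row = demo.loc['r1']   # one row -> Series indexed by the column labels
col = demo['b']        # one column -> Series indexed by the row labels

# A Series can also be built directly from values and an index
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

print(type(row).__name__, type(col).__name__)
print(list(row.index), list(col.index))
print(s['y'])
```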

In [6]:
df.iloc[:,:] # all rows and all columns, selected by position

Unnamed: 0,Column1,Column2,Column3,Coumn4
Row1,0,1,2,3
Row2,4,5,6,7
Row3,8,9,10,11
Row4,12,13,14,15
Row5,16,17,18,19


In [7]:
## Take every row, from the third column (Column3) onward
df.iloc[:,2:]

Unnamed: 0,Column3,Coumn4
Row1,2,3
Row2,6,7
Row3,10,11
Row4,14,15
Row5,18,19


In [8]:
# View the data as an array: convert the selected DataFrame columns to a NumPy array
df.iloc[:,1:].values

array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11],
       [13, 14, 15],
       [17, 18, 19]])
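`iloc` selects by integer position, while `loc` selects by label, and the two treat the end of a slice differently. The sketch below uses a hypothetical frame to show both. It also uses `.to_numpy()`, which current pandas documentation recommends over `.values`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration
demo = pd.DataFrame(np.arange(12).reshape(3, 4),
                    index=['r1', 'r2', 'r3'],
                    columns=['a', 'b', 'c', 'd'])

by_position = demo.iloc[0:2, 1:3]         # end positions are excluded
by_label = demo.loc['r1':'r2', 'b':'c']   # end labels are included

arr = demo.iloc[:, 1:].to_numpy()         # plain NumPy array, labels dropped

print(by_position.equals(by_label))
print(arr.shape)
```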

In [9]:
df['Column1'].value_counts()

12    1
4     1
16    1
8     1
0     1
Name: Column1, dtype: int64

In [10]:
# Read a dataset stored in CSV format using pandas
df=pd.read_csv('DS MercBenz.csv')

In [11]:
df.head()

Unnamed: 0,ID,y,X0,X1,X2,X3,X4,X5,X6,X8,...,X375,X376,X377,X378,X379,X380,X382,X383,X384,X385
0,0,130.81,k,v,at,a,d,u,j,o,...,0,0,1,0,0,0,0,0,0,0
1,6,88.53,k,t,av,e,d,y,l,o,...,1,0,0,0,0,0,0,0,0,0
2,7,76.26,az,w,n,c,d,x,j,x,...,0,0,0,0,0,0,1,0,0,0
3,9,80.62,az,t,n,f,d,x,l,e,...,0,0,0,0,0,0,0,0,0,0
4,13,78.02,az,v,n,f,d,h,d,n,...,0,0,0,0,0,0,0,0,0,0


In [12]:
#information about dataset
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4209 entries, 0 to 4208
Columns: 378 entries, ID to X385
dtypes: float64(1), int64(369), object(8)
memory usage: 12.1+ MB


In [13]:
#get some description about your dataset
df.describe()

Unnamed: 0,ID,y,X10,X11,X12,X13,X14,X15,X16,X17,...,X375,X376,X377,X378,X379,X380,X382,X383,X384,X385
count,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,...,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0,4209.0
mean,4205.960798,100.669318,0.013305,0.0,0.075077,0.057971,0.42813,0.000475,0.002613,0.007603,...,0.318841,0.057258,0.314802,0.02067,0.009503,0.008078,0.007603,0.001663,0.000475,0.001426
std,2437.608688,12.679381,0.11459,0.0,0.263547,0.233716,0.494867,0.021796,0.051061,0.086872,...,0.466082,0.232363,0.464492,0.142294,0.097033,0.089524,0.086872,0.040752,0.021796,0.037734
min,0.0,72.11,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,2095.0,90.82,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,4220.0,99.15,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,6314.0,109.01,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,8417.0,265.32,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [14]:
#Get the unique category counts
df['X0'].value_counts()

z     360
ak    349
y     324
ay    313
t     306
x     300
o     269
f     227
n     195
w     182
j     181
az    175
aj    151
s     106
ap    103
h      75
d      73
al     67
v      36
af     35
m      34
ai     34
e      32
ba     27
at     25
a      21
ax     19
aq     18
am     18
i      18
u      17
aw     16
l      16
ad     14
k      11
au     11
b      11
r      10
as     10
bc      6
ao      4
c       3
aa      2
q       2
ac      1
g       1
ab      1
Name: X0, dtype: int64
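`value_counts` can also report proportions instead of raw counts. This sketch uses a short, made-up Series of category codes rather than the real `X0` column.

```python
import pandas as pd

# Made-up category codes for illustration
s = pd.Series(['z', 'ak', 'z', 'y', 'z', 'ak'])

counts = s.value_counts()                 # sorted from most to least frequent
shares = s.value_counts(normalize=True)   # fractions that sum to 1

print(counts.to_dict())
print(shares.round(2).to_dict())
print(s.nunique())                        # number of distinct categories
```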

In [15]:
df[df['y']>100]

Unnamed: 0,ID,y,X0,X1,X2,X3,X4,X5,X6,X8,...,X375,X376,X377,X378,X379,X380,X382,X383,X384,X385
0,0,130.81,k,v,at,a,d,u,j,o,...,0,0,1,0,0,0,0,0,0,0
6,24,128.76,al,r,e,f,d,f,h,s,...,0,0,0,0,0,0,0,0,0,0
8,27,108.67,w,s,as,e,d,f,i,h,...,1,0,0,0,0,0,0,0,0,0
9,30,126.99,j,b,aq,c,d,f,a,e,...,0,0,1,0,0,0,0,0,0,0
10,31,102.09,h,r,r,f,d,f,h,p,...,0,0,0,0,0,0,0,0,0,0
16,39,106.10,al,r,e,f,d,j,h,o,...,0,0,0,0,0,0,0,0,0,0
17,40,114.13,ay,aa,as,c,d,j,c,d,...,0,0,1,0,0,0,0,0,0,0
22,50,108.14,x,r,a,b,d,j,j,o,...,1,0,0,0,0,0,0,0,0,0
23,52,120.77,x,b,m,c,d,j,j,n,...,0,0,1,0,0,0,0,0,0,0
26,61,104.07,aj,l,as,f,d,j,g,j,...,0,0,0,0,0,0,0,0,0,0


In [16]:
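# numeric_only=True skips the text columns (argument available in pandas 1.5 and later)
df.corr(numeric_only=True)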

Unnamed: 0,ID,y,X10,X11,X12,X13,X14,X15,X16,X17,...,X375,X376,X377,X378,X379,X380,X382,X383,X384,X385
ID,1.000000,-0.055108,0.001602,,0.058988,-0.031917,-0.025438,0.002237,-0.036480,-0.038171,...,0.045229,-0.080259,-0.022965,0.030371,0.023382,-0.013577,-0.038171,-0.009332,-0.015355,0.029059
y,-0.055108,1.000000,-0.026985,,0.089792,0.048276,0.193643,0.023116,0.048946,-0.159815,...,0.029100,0.114005,0.061403,-0.258679,0.067919,0.040932,-0.159815,0.040291,-0.004591,-0.022280
X10,0.001602,-0.026985,1.000000,,-0.033084,-0.028806,-0.100474,-0.002532,-0.005944,-0.010164,...,0.165277,-0.028618,-0.074244,-0.016870,-0.011374,-0.010479,-0.010164,-0.004740,-0.002532,-0.004387
X11,,,,,,,,,,,...,,,,,,,,,,
X12,0.058988,0.089792,-0.033084,,1.000000,0.214825,-0.246513,-0.006212,-0.014584,-0.024937,...,-0.107864,-0.070214,0.030134,-0.016043,-0.027907,-0.005566,-0.024937,-0.011628,-0.006212,-0.010765
X13,-0.031917,0.048276,-0.028806,,0.214825,1.000000,-0.083141,-0.005409,-0.012698,-0.021713,...,-0.169721,-0.061136,0.357229,-0.036040,-0.024299,0.023045,-0.021713,-0.010125,0.041242,-0.009373
X14,-0.025438,0.193643,-0.100474,,-0.246513,-0.083141,1.000000,-0.018865,-0.044291,0.012713,...,0.118950,0.026496,-0.097464,-0.037958,0.103309,0.007743,0.012713,0.023604,0.025199,0.043667
X15,0.002237,0.023116,-0.002532,,-0.006212,-0.005409,-0.018865,1.000000,-0.001116,-0.001908,...,-0.014917,-0.005373,0.032168,-0.003168,-0.002136,-0.001968,-0.001908,-0.000890,-0.000475,-0.000824
X16,-0.036480,0.048946,-0.005944,,-0.014584,-0.012698,-0.044291,-0.001116,1.000000,-0.004480,...,-0.025036,-0.012615,0.045461,-0.007437,-0.005014,-0.004619,-0.004480,-0.002089,-0.001116,-0.001934
X17,-0.038171,-0.159815,-0.010164,,-0.024937,-0.021713,0.012713,-0.001908,-0.004480,1.000000,...,-0.059883,-0.021571,-0.059327,-0.012716,-0.008573,-0.007899,1.000000,-0.003572,-0.001908,-0.003307


In [17]:
df['X11'].value_counts()

0    4209
Name: X11, dtype: int64
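This explains the empty `X11` row and column in the correlation matrix: a column that never varies has zero standard deviation, so its Pearson correlation is undefined and pandas reports `NaN`. A small sketch with made-up data:

```python
import pandas as pd

# Made-up data: 'const' never changes
demo = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0],
                     'b': [2.0, 4.0, 6.0, 8.0],
                     'const': [0, 0, 0, 0]})

print(demo['const'].std())   # zero spread
corr = demo.corr()
print(corr)

# One option: drop constant columns before computing correlations
varying = demo.loc[:, demo.nunique() > 1]
print(varying.columns.tolist())
```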

In [18]:
#importing numpy library
import numpy as np

In [19]:
lst_data=[[1,2,3],[3,4,np.nan],[5,6,np.nan],[np.nan,np.nan,np.nan]]

In [20]:
df=pd.DataFrame(lst_data)

In [21]:
df.head()

Unnamed: 0,0,1,2
0,1.0,2.0,3.0
1,3.0,4.0,
2,5.0,6.0,
3,,,


In [22]:
#here you will see how to handle missing values

# Handling Missing Values

## Drop NaN (not a number) values

df.dropna(axis=0)

Unnamed: 0,0,1,2
0,1.0,2.0,3.0


In [23]:
df.dropna(axis=1)

0
1
2
3
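By default, `dropna` removes a row or column if it contains any `NaN`, which is why `axis=1` leaves no columns here. The `how` and `thresh` parameters relax that rule. A sketch on the same list data:

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame([[1, 2, 3], [3, 4, np.nan],
                     [5, 6, np.nan], [np.nan, np.nan, np.nan]])

any_nan = demo.dropna()                # default how='any': keep only complete rows
all_nan = demo.dropna(how='all')       # drop a row only when every value is NaN
at_least_two = demo.dropna(thresh=2)   # keep rows with at least 2 non-NaN values

print(len(any_nan), len(all_nan), len(at_least_two))
```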


In [24]:
df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
                     columns=['one', 'two', 'three'])

In [25]:
df.head()

Unnamed: 0,one,two,three
a,-1.359816,-0.175973,-0.521431
c,0.747194,0.507305,-0.216071
e,-0.297323,-0.151944,-1.649469
f,-0.53283,0.197842,0.990465
h,0.066165,2.024749,-0.465154


In [26]:
df2=df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])

In [27]:
df2 #lets see dataframe 2

Unnamed: 0,one,two,three
a,-1.359816,-0.175973,-0.521431
b,,,
c,0.747194,0.507305,-0.216071
d,,,
e,-0.297323,-0.151944,-1.649469
f,-0.53283,0.197842,0.990465
g,,,
h,0.066165,2.024749,-0.465154


In [28]:
df2.dropna(axis=0)

Unnamed: 0,one,two,three
a,-1.359816,-0.175973,-0.521431
c,0.747194,0.507305,-0.216071
e,-0.297323,-0.151944,-1.649469
f,-0.53283,0.197842,0.990465
h,0.066165,2.024749,-0.465154


In [29]:
pd.isna(df2['one'])

a    False
b     True
c    False
d     True
e    False
f    False
g     True
h    False
Name: one, dtype: bool

In [30]:
df2['one'].notna()

a     True
b    False
c     True
d    False
e     True
f     True
g    False
h     True
Name: one, dtype: bool

In [31]:
df2.fillna('Missing') # replace every NaN with the label 'Missing'

Unnamed: 0,one,two,three
a,-1.35982,-0.175973,-0.521431
b,Missing,Missing,Missing
c,0.747194,0.507305,-0.216071
d,Missing,Missing,Missing
e,-0.297323,-0.151944,-1.64947
f,-0.53283,0.197842,0.990465
g,Missing,Missing,Missing
h,0.0661653,2.02475,-0.465154
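Besides a text label, missing values are often replaced with a number so that the columns stay numeric. The sketch below uses made-up values and fills with zero and with each column's mean. Which strategy is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration
demo = pd.DataFrame({'one': [1.0, np.nan, 3.0],
                     'two': [np.nan, 4.0, 8.0]})

zero_filled = demo.fillna(0)             # constant replacement
mean_filled = demo.fillna(demo.mean())   # each NaN gets its own column's mean

print(zero_filled)
print(mean_filled)
```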


In [32]:
df2['one'].values

array([-1.35981591,         nan,  0.74719432,         nan, -0.29732286,
       -0.53283014,         nan,  0.06616531])

In [33]:
# Reading different data sources with the help of pandas

### CSV

A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format. A CSV file typically stores tabular data (numbers and text) in plain text, in which case each line will have the same number of fields.

In [34]:
from io import StringIO, BytesIO

In [35]:
data = ('col1,col2,col3\n'
            'x,y,1\n'
            'a,b,2\n'
            'c,d,3')

In [36]:
type(data)

str

In [37]:
pd.read_csv(StringIO(data))

Unnamed: 0,col1,col2,col3
0,x,y,1
1,a,b,2
2,c,d,3


In [38]:
# Read from specific columns
df=pd.read_csv(StringIO(data), usecols=lambda x: x.upper() in ['COL1', 'COL3'])

In [39]:
df.to_csv('DS TestIT.csv')
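By default, `to_csv` also writes the row index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. Passing `index=False` avoids that. The sketch below writes to an in-memory string rather than a file, and the frame is a stand-in for the one above.

```python
from io import StringIO
import pandas as pd

# Stand-in frame for illustration
demo = pd.DataFrame({'col1': ['x', 'a'], 'col3': [1, 2]})

with_index = demo.to_csv()                 # returns the CSV text when no path is given
without_index = demo.to_csv(index=False)   # leave the index out of the file

print(pd.read_csv(StringIO(with_index)).columns.tolist())
print(pd.read_csv(StringIO(without_index)).columns.tolist())
```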

In [40]:
# Specifying columns data types

data = ('a,b,c,d\n'
            '1,2,3,4\n'
            '5,6,7,8\n'
            '9,10,11')

In [41]:
print(data)

a,b,c,d
1,2,3,4
5,6,7,8
9,10,11


In [42]:
df=pd.read_csv(StringIO(data),dtype=object)

In [43]:
df

Unnamed: 0,a,b,c,d
0,1,2,3,4.0
1,5,6,7,8.0
2,9,10,11,


In [44]:
df['a'][1]

'5'

In [45]:
# np.float was removed in NumPy 1.24; the built-in float is equivalent
df=pd.read_csv(StringIO(data),dtype={'b':int,'c':float,'a':'Int64'})

In [46]:
df

Unnamed: 0,a,b,c,d
0,1,2,3.0,4.0
1,5,6,7.0,8.0
2,9,10,11.0,


In [47]:
df['a'][1]

5

In [48]:
# check the datatype
df.dtypes

a      Int64
b      int32
c    float64
d    float64
dtype: object
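The capitalised `'Int64'` is pandas' nullable integer type. Unlike NumPy's `int64`, it can hold missing values, so an integer column with gaps does not have to become `float64` the way `d` did. A sketch with made-up CSV text:

```python
from io import StringIO
import pandas as pd

# Made-up CSV text; the last row has no value for 'd'
text = 'a,d\n1,4\n5,8\n9,'

default = pd.read_csv(StringIO(text))
nullable = pd.read_csv(StringIO(text), dtype={'d': 'Int64'})

print(default['d'].dtype)     # float dtype, because NaN is a float
print(nullable['d'].dtype)    # nullable integer dtype
print(nullable['d'].isna().tolist())
```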

In [49]:
# Index columns and trailing delimiters

In [50]:
data = ('index,a,b,c\n'
           '4,apple,bat,5.7\n'
            '8,orange,cow,10')

In [51]:
pd.read_csv(StringIO(data),index_col=0)

index,a,b,c
4,apple,bat,5.7
8,orange,cow,10.0


In [52]:
data = ('a,b,c\n'
        '4,apple,bat,\n'
        '8,orange,cow,')

In [53]:
pd.read_csv(StringIO(data))

Unnamed: 0,a,b,c
4,apple,bat,
8,orange,cow,


In [54]:
pd.read_csv(StringIO(data),index_col=False)

Unnamed: 0,a,b,c
0,4,apple,bat
1,8,orange,cow


In [55]:
# Combining usecols and index_col
data = ("a,b,c\n"
           "4,apple,bat,\n"
            "8,orange,cow,")

In [56]:
pd.read_csv(StringIO(data), usecols=['b', 'c'],index_col=False)

Unnamed: 0,b,c
0,apple,bat
1,orange,cow


In [57]:
# Quoting and Escape Characters. Very useful in NLP

data = 'a,b\n"hello, \\"Bob\\", nice to see you",5'

In [58]:
pd.read_csv(StringIO(data),escapechar='\\')

Unnamed: 0,a,b
0,"hello, ""Bob"", nice to see you",5


In [59]:
# URL to CSV

df=pd.read_csv('https://download.bls.gov/pub/time.series/cu/cu.item',
                 sep='\t')

In [60]:
df.head()

Unnamed: 0,item_code,item_name,display_level,selectable,sort_sequence
0,AA0,All items - old base,0,T,2
1,AA0R,Purchasing power of the consumer dollar - old ...,0,T,399
2,SA0,All items,0,T,1
3,SA0E,Energy,1,T,374
4,SA0L1,All items less food,1,T,358


In [61]:
## Read JSON into a DataFrame

In [62]:
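Data = '{"employee_name": "James", "email": "james@gmail.com", "job_profile": [{"title1":"Team Lead", "title2":"Sr. Developer"}]}'
# Wrap the string in StringIO: passing literal JSON text directly is deprecated in recent pandas
pd.read_json(StringIO(Data))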

Unnamed: 0,employee_name,email,job_profile
0,James,james@gmail.com,"{'title1': 'Team Lead', 'title2': 'Sr. Develop..."
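`read_json` leaves the nested `job_profile` entry as a dictionary inside a single cell. One way to flatten it is `pd.json_normalize`. The sketch below parses the same string with the standard `json` module first.

```python
import json
import pandas as pd

Data = ('{"employee_name": "James", "email": "james@gmail.com", '
        '"job_profile": [{"title1":"Team Lead", "title2":"Sr. Developer"}]}')

record = json.loads(Data)

# One output row per entry in job_profile, with the top-level
# fields repeated alongside as metadata columns
flat = pd.json_normalize(record, record_path='job_profile',
                         meta=['employee_name', 'email'])

print(flat.columns.tolist())
print(flat.iloc[0].tolist())
```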


In [63]:
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data', header=None)

In [64]:
df.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13
0,1,14.23,1.71,2.43,15.6,127,2.8,3.06,0.28,2.29,5.64,1.04,3.92,1065
1,1,13.2,1.78,2.14,11.2,100,2.65,2.76,0.26,1.28,4.38,1.05,3.4,1050
2,1,13.16,2.36,2.67,18.6,101,2.8,3.24,0.3,2.81,5.68,1.03,3.17,1185
3,1,14.37,1.95,2.5,16.8,113,3.85,3.49,0.24,2.18,7.8,0.86,3.45,1480
4,1,13.24,2.59,2.87,21.0,118,2.8,2.69,0.39,1.82,4.32,1.04,2.93,735


In [65]:
# Save the DataFrame to a CSV file

In [66]:
df.to_csv('DS Lets have some wine.csv')

In [67]:
# Convert the DataFrame to JSON in different orientations
df.to_json(orient="index")

'{"0":{"0":1,"1":14.23,"2":1.71,"3":2.43,"4":15.6,"5":127,"6":2.8,"7":3.06,"8":0.28,"9":2.29,"10":5.64,"11":1.04,"12":3.92,"13":1065},"1":{"0":1,"1":13.2,"2":1.78,"3":2.14,"4":11.2,"5":100,"6":2.65,"7":2.76,"8":0.26,"9":1.28,"10":4.38,"11":1.05,"12":3.4,"13":1050},"2":{"0":1,"1":13.16,"2":2.36,"3":2.67,"4":18.6,"5":101,"6":2.8,"7":3.24,"8":0.3,"9":2.81,"10":5.68,"11":1.03,"12":3.17,"13":1185},"3":{"0":1,"1":14.37,"2":1.95,"3":2.5,"4":16.8,"5":113,"6":3.85,"7":3.49,"8":0.24,"9":2.18,"10":7.8,"11":0.86,"12":3.45,"13":1480},"4":{"0":1,"1":13.24,"2":2.59,"3":2.87,"4":21.0,"5":118,"6":2.8,"7":2.69,"8":0.39,"9":1.82,"10":4.32,"11":1.04,"12":2.93,"13":735},"5":{"0":1,"1":14.2,"2":1.76,"3":2.45,"4":15.2,"5":112,"6":3.27,"7":3.39,"8":0.34,"9":1.97,"10":6.75,"11":1.05,"12":2.85,"13":1450},"6":{"0":1,"1":14.39,"2":1.87,"3":2.45,"4":14.6,"5":96,"6":2.5,"7":2.52,"8":0.3,"9":1.98,"10":5.25,"11":1.02,"12":3.58,"13":1290},"7":{"0":1,"1":14.06,"2":2.15,"3":2.61,"4":17.6,"5":121,"6":2.6,"7":2.51,"8":0.3

In [68]:
df.to_json(orient="records")

'[{"0":1,"1":14.23,"2":1.71,"3":2.43,"4":15.6,"5":127,"6":2.8,"7":3.06,"8":0.28,"9":2.29,"10":5.64,"11":1.04,"12":3.92,"13":1065},{"0":1,"1":13.2,"2":1.78,"3":2.14,"4":11.2,"5":100,"6":2.65,"7":2.76,"8":0.26,"9":1.28,"10":4.38,"11":1.05,"12":3.4,"13":1050},{"0":1,"1":13.16,"2":2.36,"3":2.67,"4":18.6,"5":101,"6":2.8,"7":3.24,"8":0.3,"9":2.81,"10":5.68,"11":1.03,"12":3.17,"13":1185},{"0":1,"1":14.37,"2":1.95,"3":2.5,"4":16.8,"5":113,"6":3.85,"7":3.49,"8":0.24,"9":2.18,"10":7.8,"11":0.86,"12":3.45,"13":1480},{"0":1,"1":13.24,"2":2.59,"3":2.87,"4":21.0,"5":118,"6":2.8,"7":2.69,"8":0.39,"9":1.82,"10":4.32,"11":1.04,"12":2.93,"13":735},{"0":1,"1":14.2,"2":1.76,"3":2.45,"4":15.2,"5":112,"6":3.27,"7":3.39,"8":0.34,"9":1.97,"10":6.75,"11":1.05,"12":2.85,"13":1450},{"0":1,"1":14.39,"2":1.87,"3":2.45,"4":14.6,"5":96,"6":2.5,"7":2.52,"8":0.3,"9":1.98,"10":5.25,"11":1.02,"12":3.58,"13":1290},{"0":1,"1":14.06,"2":2.15,"3":2.61,"4":17.6,"5":121,"6":2.6,"7":2.51,"8":0.31,"9":1.25,"10":5.05,"11":1.06,"
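The `orient` argument controls the shape of the JSON. `index` nests each row under its index label, `records` produces a list of row objects, and `columns` nests values under each column name. The sketch below uses a two-row frame with hypothetical labels and parses each result back with the standard `json` module.

```python
import json
import pandas as pd

# Hypothetical two-row frame for illustration
demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['r1', 'r2'])

by_index = json.loads(demo.to_json(orient='index'))
by_records = json.loads(demo.to_json(orient='records'))
by_columns = json.loads(demo.to_json(orient='columns'))

print(by_index)
print(by_records)
print(by_columns)
```

Note that `records` drops the index labels entirely, so it suits data where the index carries no information.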

### Reading HTML content 

`pd.read_html` takes the URL of a web page and returns a list of DataFrames, one for each HTML table found on the page. It needs an HTML parser such as lxml to be installed.

In [69]:
url = 'https://www.fdic.gov/bank/individual/failed/banklist.html' # URL of a page that contains an HTML table

dfs = pd.read_html(url) # read the online data: a list with one DataFrame per table

In [70]:
dfs[0]

Unnamed: 0,Bank Name,City,ST,CERT,Acquiring Institution,Closing Date
0,The First State Bank,Barboursville,WV,14361,"MVB Bank, Inc.","April 3, 2020"
1,Ericson State Bank,Ericson,NE,18265,Farmers and Merchants Bank,"February 14, 2020"
2,City National Bank of New Jersey,Newark,NJ,21111,Industrial Bank,"November 1, 2019"
3,Resolute Bank,Maumee,OH,58317,Buckeye State Bank,"October 25, 2019"
4,Louisa Community Bank,Louisa,KY,58112,Kentucky Farmers Bank Corporation,"October 25, 2019"
5,The Enloe State Bank,Cooper,TX,10716,"Legend Bank, N. A.","May 31, 2019"
6,Washington Federal Bank for Savings,Chicago,IL,30570,Royal Savings Bank,"December 15, 2017"
7,The Farmers and Merchants State Bank of Argonia,Argonia,KS,17719,Conway Bank,"October 13, 2017"
8,Fayette County Bank,Saint Elmo,IL,1802,"United Fidelity Bank, fsb","May 26, 2017"
9,"Guaranty Bank, (d/b/a BestBank in Georgia & Mi...",Milwaukee,WI,30003,First-Citizens Bank & Trust Company,"May 5, 2017"


In [71]:
# Same as above, but keep only the tables whose text matches 'Country'
url_mcc = 'https://en.wikipedia.org/wiki/Mobile_country_code'
dfs = pd.read_html(url_mcc, match='Country', header=0)

In [72]:
dfs[0]

Unnamed: 0,Mobile country code,Country,ISO 3166,Mobile network codes,National MNC authority,Remarks
0,289,A Abkhazia,GE-AB,List of mobile network codes in Abkhazia,,MCC is not listed by ITU
1,412,Afghanistan,AF,List of mobile network codes in Afghanistan,,
2,276,Albania,AL,List of mobile network codes in Albania,,
3,603,Algeria,DZ,List of mobile network codes in Algeria,,
4,544,American Samoa (United States of America),AS,List of mobile network codes in American Samoa,,
5,213,Andorra,AD,List of mobile network codes in Andorra,,
6,631,Angola,AO,List of mobile network codes in Angola,,
7,365,Anguilla (United Kingdom),AI,List of mobile network codes in Anguilla,Public Utilities Commission (PUC)[5],
8,344,Antigua and Barbuda,AG,List of mobile network codes in Antigua and Ba...,,
9,722,Argentina,AR,List of mobile network codes in Argentina,,


### Reading Excel Files

`pd.read_excel` reads a worksheet from an Excel file into a DataFrame.

In [73]:
df_excel=pd.read_excel('DS Sample.xlsx')

In [74]:
df_excel.head()

Unnamed: 0.1,Unnamed: 0,a,b,c,d
0,0,0,1,2,3
1,1,4,5,6,7
2,2,8,9,10,11
3,3,12,13,14,15


### Pickling
All pandas objects are equipped with a `to_pickle` method, which uses Python's `pickle` module to save data structures to disk in the pickle format. `pd.read_pickle` loads them back.

In [75]:
df_excel.to_pickle('df_excel')

In [76]:
df=pd.read_pickle('df_excel')

In [77]:
df.head()

Unnamed: 0.1,Unnamed: 0,a,b,c,d
0,0,0,1,2,3
1,1,4,5,6,7
2,2,8,9,10,11
3,3,12,13,14,15
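A pickle round trip restores the index and the column dtypes as they were saved, which CSV does not guarantee. The sketch below uses a temporary directory and an arbitrary file name. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.

```python
import os
import tempfile
import pandas as pd

# Made-up frame with a labelled index and a missing value
demo = pd.DataFrame({'a': [0, 4], 'b': [1.5, None]}, index=['r1', 'r2'])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.pkl')   # arbitrary file name
    demo.to_pickle(path)
    restored = pd.read_pickle(path)

print(restored.equals(demo))
print(restored.dtypes.tolist() == demo.dtypes.tolist())
```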


## REFERENCES 

1. https://pandas.pydata.org/about/
2. https://en.wikipedia.org/wiki/Comma-separated_values