DataFrame Structure
The DataFrame is conceptually a two-dimensional Series object: it has an index and multiple columns of content, and each column has a label. In fact, the distinction between a column and a row is largely conceptual, and you can think of the DataFrame itself as simply a two-axis labeled array.
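As a minimal sketch of the two-axis idea, the snippet below builds a tiny frame and inspects both sets of labels. The frame `demo` and its labels are made up for illustration and are not part of the country records used later.

```python
import pandas as pd

# Hypothetical two-row frame, used only to illustrate the two labeled axes.
demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['r1', 'r2'])

row_labels = list(demo.index)     # labels along axis 0 (rows)
col_labels = list(demo.columns)   # labels along axis 1 (columns)
print(row_labels, col_labels, demo.shape)

# Transposing swaps the two axes, which is why the row/column
# distinction is mostly a matter of convention.
swapped = demo.T
print(list(swapped.index), list(swapped.columns))
```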

In [1]:
import pandas as pd

In [2]:
# Series to DataFrame
# Let's create one record per country holding its life expectancy, income group, and GDP per capita (GDPC)
pak_record=pd.Series({'country':'PAK','life':66,'Income':'Lower','GDPC':1407})
ind_record=pd.Series({'country':'IND','life':68,'Income':'Lower','GDPC':2484.8})
bgd_record=pd.Series({'country':'BGD','life':74,'Income':'Lower','GDPC':2529.1})

In [4]:
# Like a Series, the DataFrame object is indexed. Here I'll use a group of Series, where each Series
# represents a row of data. Just as with the Series constructor, we can pass in our individual items
# as a list, and we can pass in our index values as a second argument.
df=pd.DataFrame([pak_record,ind_record,bgd_record],index=['country1','country2','country3'])
df.head(3)

          country  life  Income    GDPC
country1      PAK    66   Lower  1407.0
country2      IND    68   Lower  2484.8
country3      BGD    74   Lower  2529.1


In [7]:
# We can also pass a list of dictionaries to DataFrame to get the same result
countries=[{'country':'PAK','life':66,'Income':'Lower','GDPC':1407},
          {'country':'IND','life':68,'Income':'Lower','GDPC':2484.8},
          {'country':'BGD','life':74,'Income':'Lower','GDPC':2529.1}]
df=pd.DataFrame(countries,index=['country1','country2','country3'])
df.head()

          country  life  Income    GDPC
country1      PAK    66   Lower  1407.0
country2      IND    68   Lower  2484.8
country3      BGD    74   Lower  2529.1


In [9]:
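# loc and iloc
# Both can select rows: loc by index label, iloc by zero-based integer position
df.loc["country3"]  # the row labeled 'country3'
df.iloc[2]  # the same row, selected as the third position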

country       BGD
life           74
Income      Lower
GDPC       2529.1
Name: country3, dtype: object
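To make the label-versus-position distinction concrete, here is a small self-contained sketch that rebuilds the same three-row frame and compares the two lookups. The variable names `by_label`, `by_position`, and `subset` are illustrative only.

```python
import pandas as pd

countries = [{'country': 'PAK', 'life': 66, 'Income': 'Lower', 'GDPC': 1407},
             {'country': 'IND', 'life': 68, 'Income': 'Lower', 'GDPC': 2484.8},
             {'country': 'BGD', 'life': 74, 'Income': 'Lower', 'GDPC': 2529.1}]
df = pd.DataFrame(countries, index=['country1', 'country2', 'country3'])

by_label = df.loc['country3']   # lookup by index label
by_position = df.iloc[2]        # lookup by zero-based position
print(list(by_label) == list(by_position))

# Passing a list of labels returns a DataFrame rather than a Series.
subset = df.loc[['country1', 'country3']]
print(type(subset).__name__, subset.shape)
```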

In [11]:
type(df.loc["country3"])

pandas.core.series.Series

In [10]:
df.T  # transpose: row labels become column labels and vice versa

         country1  country2  country3
country       PAK       IND       BGD
life           66        68        74
Income      Lower     Lower     Lower
GDPC       1407.0    2484.8    2529.1


In [12]:
df.loc['country2', 'life']  # row label, then column label: only country2's life expectancy

68

In [13]:
df.T.loc['country']  # calling .loc on the transpose returns just the country names

country1    PAK
country2    IND
country3    BGD
Name: country, dtype: object
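One caveat with the transpose route is worth a quick sketch: because the columns here mix strings and numbers, the transposed frame falls back to a generic object dtype, so a numeric column pulled out through `.T` loses its numeric dtype. The names `via_transpose` and `via_column` are illustrative.

```python
import pandas as pd

countries = [{'country': 'PAK', 'life': 66, 'Income': 'Lower', 'GDPC': 1407},
             {'country': 'IND', 'life': 68, 'Income': 'Lower', 'GDPC': 2484.8},
             {'country': 'BGD', 'life': 74, 'Income': 'Lower', 'GDPC': 2529.1}]
df = pd.DataFrame(countries, index=['country1', 'country2', 'country3'])

via_transpose = df.T.loc['life']   # goes through a mixed-type transposed frame
via_column = df['life']            # direct column selection keeps the integer dtype

print(via_transpose.dtype, via_column.dtype)
print(list(via_transpose) == list(via_column))   # same values either way
```

For that reason, selecting the column directly is usually the simpler choice when the dtype matters.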

In [14]:
df['life']  # selects the life column
# df.loc['life'] would raise a KeyError, because .loc with a single label looks up row labels, not column names

country1    66
country2    68
country3    74
Name: life, dtype: int64
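The KeyError mentioned above can be reproduced safely with a try/except, as in this sketch. It also shows one way to select a column through `.loc` by passing a full row slice first. The flag `raised` is just an illustrative helper.

```python
import pandas as pd

countries = [{'country': 'PAK', 'life': 66, 'Income': 'Lower', 'GDPC': 1407},
             {'country': 'IND', 'life': 68, 'Income': 'Lower', 'GDPC': 2484.8},
             {'country': 'BGD', 'life': 74, 'Income': 'Lower', 'GDPC': 2529.1}]
df = pd.DataFrame(countries, index=['country1', 'country2', 'country3'])

# 'life' is a column name, not a row label, so a single-label .loc lookup fails.
try:
    df.loc['life']
    raised = False
except KeyError:
    raised = True
print(raised)

# With .loc, columns are the second axis: take all rows, then the 'life' column.
life = df.loc[:, 'life']
print(list(life))
```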

In [15]:
df.loc['country1']['GDPC']  # chained lookup: select the row as a Series, then its GDPC entry

1407.0

In [17]:
print(type(df.loc['country1']['GDPC']))
print(type(df.loc['country1']))

<class 'numpy.float64'>
<class 'pandas.core.series.Series'>
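The type output above shows that the chained lookup first materializes the whole row as a Series and only then picks out one value. As a hedged sketch, the same scalar can be fetched in a single step with a row label and a column label together, or with `.at`, which is intended for scalar access. The names `chained`, `direct`, and `fast` are illustrative.

```python
import pandas as pd

countries = [{'country': 'PAK', 'life': 66, 'Income': 'Lower', 'GDPC': 1407},
             {'country': 'IND', 'life': 68, 'Income': 'Lower', 'GDPC': 2484.8},
             {'country': 'BGD', 'life': 74, 'Income': 'Lower', 'GDPC': 2529.1}]
df = pd.DataFrame(countries, index=['country1', 'country2', 'country3'])

chained = df.loc['country1']['GDPC']   # two steps: row Series, then one entry
direct = df.loc['country1', 'GDPC']    # one step: row label and column label together
fast = df.at['country1', 'GDPC']       # scalar accessor for a single cell

print(chained == direct == fast)
```

All three return the same value for reads. The single-step forms are generally preferable when assigning, since chained indexing can end up modifying a temporary copy instead of the original frame.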
