Pandas DataFrame - Practice Tasks
This notebook contains practice examples for creating and working with Pandas DataFrames.

# Syntax
dataframe_name = pandas.DataFrame(data, columns=.., index=.., dtype=..)

     data: Actual data (lists, dictionaries, Series, NumPy arrays, etc.)
     columns: Names for the columns
     index: Row labels
     dtype: A single data type to apply to all columns (optional; inferred from the data if omitted)
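As a minimal sketch of how these four parameters fit together (the data, labels, and the name `df_example` are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: two rows, two numeric columns
data = [[1, 2], [3, 4]]

df_example = pd.DataFrame(
    data,
    columns=["a", "b"],      # column names
    index=["row1", "row2"],  # row labels
    dtype="float64",         # one dtype applied to every column
)
print(df_example)
print(df_example.dtypes)
```

Because `dtype` accepts only one type, per-column types are usually left to inference or set afterwards.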

# Creating DataFrames
DataFrames can be created using different Python data structures:
     
     1. List
     2. Series object
     3. Dictionary
     4. NumPy array

# Creating an empty DataFrame

In [54]:
import pandas as pd
df=pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []


1. List
# Creating a DataFrame from a single list with default indexing

In [9]:
#Example: Single list as a column
product= ["TV", "Mobile", "Laptop", "Tablet"]
df=pd.DataFrame(product)
print(df)

        0
0      TV
1  Mobile
2  Laptop
3  Tablet


# Creating a DataFrame from a nested list with a custom index and column names

In [52]:
sales_details=[["TV",10000],["Mobile",30000],["Laptop",50000],["Tablet",40000]]
product_details=['product1','product2','product3','product4']
df=pd.DataFrame(sales_details,index=product_details,columns=['product_name','sales_amount'])
print(df)

         product_name  sales_amount
product1           TV         10000
product2       Mobile         30000
product3       Laptop         50000
product4       Tablet         40000


2. Series object

# Creating a DataFrame from a single Series with default indexing

In [10]:
product= pd.Series(["TV", "Mobile", "Laptop", "Tablet"])
df=pd.DataFrame(product)
print(df)

        0
0      TV
1  Mobile
2  Laptop
3  Tablet


# Creating a DataFrame from a Series with a custom row index

In [53]:
product= pd.Series(["TV", "Mobile", "Laptop", "Tablet"],index=['product1','product2','product3','product4'])
df=pd.DataFrame(product)
print(df)

               0
product1      TV
product2  Mobile
product3  Laptop
product4  Tablet


# Creating a DataFrame from two Series objects (in a list, each Series becomes a row)

In [12]:
product= pd.Series(["TV", "Mobile", "Laptop", "Tablet"],index=['product1','product2','product3','product4'])
jan_sales= pd.Series([10000,30000,50000,40000],index=['product1','product2','product3','product4'])
df=pd.DataFrame([product,jan_sales])
print(df)

  product1 product2 product3 product4
0       TV   Mobile   Laptop   Tablet
1    10000    30000    50000    40000
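
Passing a list of Series makes each Series a row, as the output above shows. If one column per Series is wanted instead, one option is `pd.concat` with `axis=1`. The sketch below assumes the same data. The column names passed through `keys` and the name `df_cols` are illustrative.

```python
import pandas as pd

labels = ['product1', 'product2', 'product3', 'product4']
product = pd.Series(["TV", "Mobile", "Laptop", "Tablet"], index=labels)
jan_sales = pd.Series([10000, 30000, 50000, 40000], index=labels)

# axis=1 places the Series side by side as columns;
# keys supplies the column names (chosen here for illustration)
df_cols = pd.concat([product, jan_sales], axis=1, keys=['product_name', 'jan_sales'])
print(df_cols)
```

With this layout the sales column keeps its integer type, because the numbers are no longer mixed with text in the same column.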


3. Dictionary
# Creating a DataFrame using dictionaries
You can use:
          - Dictionary of lists
          - Dictionary of Series
          - List of dictionaries

# Dictionary of lists

In [15]:
product={'product':['TV','Mobile','Laptop','Tablet'],
         'jan_sales':[10000,30000,50000,40000],
         'feb_sales':[20000,15000,45000,30000],
         'mar_sales':[14000,26000,80000,70000]
}
df=pd.DataFrame(product)
print(df)

  product  jan_sales  feb_sales  mar_sales
0      TV      10000      20000      14000
1  Mobile      30000      15000      26000
2  Laptop      50000      45000      80000
3  Tablet      40000      30000      70000


# Dictionary of Series

In [18]:
product= pd.Series(["TV", "Mobile", "Laptop", "Tablet"])
jan =pd.Series([10000,30000,50000,40000])
feb =pd.Series([20000,15000,45000,30000])
mar =pd.Series([14000,26000,80000,70000])
product_details={'product_name':product,'jan_sales':jan,'feb_sales':feb,'mar_sales':mar}
df=pd.DataFrame(product_details)
print(df)

  product_name  jan_sales  feb_sales  mar_sales
0           TV      10000      20000      14000
1       Mobile      30000      15000      26000
2       Laptop      50000      45000      80000
3       Tablet      40000      30000      70000


# List of dictionaries

In [27]:
product_details = [
    {"Product": "TV", "Jan": 10000, "Feb": 12000},
    {"Product": "Mobile", "Jan": 30000, "Feb": 32000},
    {"Product": "Laptop", "Jan": 50000, "Feb": 48000},
    {"Product": "Tablet", "Jan": 40000, "Feb": 45000}
]
df=pd.DataFrame(product_details,index=['product1','product2','product3','product4'])
print(df)

         Product    Jan    Feb
product1      TV  10000  12000
product2  Mobile  30000  32000
product3  Laptop  50000  48000
product4  Tablet  40000  45000
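
With a list of dictionaries, the records do not all need the same keys. A key missing from a record becomes NaN in that row. A small sketch with made-up records (the name `df_missing` is hypothetical):

```python
import pandas as pd

# Hypothetical records: the second one has no "Feb" entry
records = [
    {"Product": "TV", "Jan": 10000, "Feb": 12000},
    {"Product": "Mobile", "Jan": 30000},
]
df_missing = pd.DataFrame(records)
print(df_missing)

# Count the missing values in each column
print(df_missing.isna().sum())
```

The "Feb" column is stored as float here, since NaN is a floating-point value.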


4. NumPy array

# Creating a DataFrame using NumPy arrays

In [24]:
import numpy as np

In [29]:
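# Note: a NumPy array holds a single dtype, so mixing text and numbers stores every value as a string
product1=np.array(["TV",10000,12000])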
product2=np.array([ "Mobile",30000,32000])
product3=np.array(["Laptop",50000,48000])
product4=np.array(["Tablet",40000,45000])
df1=pd.DataFrame([product1,product2,product3,product4],columns=['product_details','jan_sales','feb_sales'])
print(df1)

  product_details jan_sales feb_sales
0              TV     10000     12000
1          Mobile     30000     32000
2          Laptop     50000     48000
3          Tablet     40000     45000
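
The sales figures in the output above are strings, not integers, because each NumPy array was created from a mix of text and numbers. A minimal sketch of one way to convert them back, using `astype` (the names `rows` and `df_np` are illustrative):

```python
import numpy as np
import pandas as pd

rows = [np.array(["TV", 10000, 12000]), np.array(["Mobile", 30000, 32000])]
print(rows[0].dtype.kind)  # 'U' indicates a Unicode string array

df_np = pd.DataFrame(rows, columns=['product_details', 'jan_sales', 'feb_sales'])

# Convert the sales columns from strings to integers
df_np = df_np.astype({'jan_sales': 'int64', 'feb_sales': 'int64'})
print(df_np.dtypes)
```

After the conversion, numeric operations such as `df_np['jan_sales'].sum()` add the values instead of concatenating text.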


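## DataFrame slicing

# Accessing single values

    at method → Using .at[row_label, column_label] (label-based)

    iat method → Using .iat[row_position, column_position] (integer position-based)

# Accessing ranges

    slicing → Using slicing [start:stop] (selects rows by position; stop is excluded)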

# at method

In [30]:
print(df1.at[1,'product_details'])

Mobile


# iat method

In [31]:
print(df1.iat[1,0])

Mobile


# Accessing ranges using normal slicing

In [32]:
print(df1[1:3])

  product_details jan_sales feb_sales
1          Mobile     30000     32000
2          Laptop     50000     48000
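
In `df1` the row labels are the default integers, so labels and positions coincide. The difference between label-based and position-based access is easier to see with custom row labels. A small sketch with made-up data (`df_demo` and its labels are hypothetical):

```python
import pandas as pd

df_demo = pd.DataFrame(
    {'product': ['TV', 'Mobile', 'Laptop'], 'jan_sales': [10000, 30000, 50000]},
    index=['p1', 'p2', 'p3'],  # illustrative custom row labels
)

by_label = df_demo.at['p2', 'product']  # .at takes the row label and column label
by_position = df_demo.iat[1, 0]         # .iat takes integer positions
rows = df_demo[1:3]                     # integer slice picks rows by position; stop is excluded

print(by_label, by_position)
print(rows)
```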


# set_index and reset_index methods

In [34]:
product2={'product':['TV','Mobile','Laptop','Tablet'],
         'jan_sales':[10000,30000,50000,40000],
         'feb_sales':[20000,15000,45000,30000],
         'mar_sales':[14000,26000,80000,70000]
}
df2=pd.DataFrame(product2)
print(df2)

  product  jan_sales  feb_sales  mar_sales
0      TV      10000      20000      14000
1  Mobile      30000      15000      26000
2  Laptop      50000      45000      80000
3  Tablet      40000      30000      70000


In [35]:
df2.set_index("product", inplace=True)

In [36]:
print(df2)

         jan_sales  feb_sales  mar_sales
product                                 
TV           10000      20000      14000
Mobile       30000      15000      26000
Laptop       50000      45000      80000
Tablet       40000      30000      70000


In [37]:
df2.reset_index(inplace=True)

In [38]:
print(df2)

  product  jan_sales  feb_sales  mar_sales
0      TV      10000      20000      14000
1  Mobile      30000      15000      26000
2  Laptop      50000      45000      80000
3  Tablet      40000      30000      70000
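
The cells above modify `df2` directly through `inplace=True`. As an alternative sketch, both methods can be used without `inplace`, in which case they return a new DataFrame and leave the original untouched (the names `df_sales`, `indexed`, and `restored` are illustrative):

```python
import pandas as pd

df_sales = pd.DataFrame({'product': ['TV', 'Mobile'], 'jan_sales': [10000, 30000]})

# Without inplace=True, set_index returns a new DataFrame
indexed = df_sales.set_index('product')
print(indexed.at['Mobile', 'jan_sales'])

# reset_index moves the index back into a regular column
restored = indexed.reset_index()
print(restored.columns.tolist())

# The original DataFrame still has its default index
print(df_sales.index)
```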


# DataFrame attributes and methods

In [40]:
# Returns the number of dimensions 
print(df2.ndim)

2


In [41]:
# Returns the data types of each column 
print(df2.dtypes)

product      object
jan_sales     int64
feb_sales     int64
mar_sales     int64
dtype: object


In [42]:
# Returns total number of elements (rows × columns)
print(df2.size)

16


In [43]:
# Returns a tuple (rows, columns)
print(df2.shape)

(4, 4)


In [44]:
# Returns the data as a NumPy array
print(df2.values)

[['TV' 10000 20000 14000]
 ['Mobile' 30000 15000 26000]
 ['Laptop' 50000 45000 80000]
 ['Tablet' 40000 30000 70000]]


In [45]:
# Returns True if DataFrame is empty
print(df2.empty)

False


In [46]:
# Checks if any missing (NaN) values are present 
print(df2.isna().values.any())

False


In [47]:
# Returns the row index labels
print(df2.index)

RangeIndex(start=0, stop=4, step=1)


In [48]:
# Returns the column labels
print(df2.columns)

Index(['product', 'jan_sales', 'feb_sales', 'mar_sales'], dtype='object')


In [49]:
# Returns both row and column labels
print(df2.axes)

[RangeIndex(start=0, stop=4, step=1), Index(['product', 'jan_sales', 'feb_sales', 'mar_sales'], dtype='object')]


In [50]:
# Returns the transposed version of the DataFrame
print(df2.T)

               0       1       2       3
product       TV  Mobile  Laptop  Tablet
jan_sales  10000   30000   50000   40000
feb_sales  20000   15000   45000   30000
mar_sales  14000   26000   80000   70000


In [51]:
# Returns the number of non-null entries in each column
print(df2.count())

product      4
jan_sales    4
feb_sales    4
mar_sales    4
dtype: int64


Summary
- A DataFrame is a 2-dimensional, labeled data structure.
- It is used to store and manipulate tabular data.
- Its columns can hold different data types (e.g., int, float, string).
- Rows are labeled 0, 1, 2, ... by default, but custom labels are also supported.