Pandas is a powerful Python package designed for working with "relational" (labeled, tabular) data.
It works well for data manipulation, data analysis, and preparing data for machine learning tasks.
By convention, it is imported with "import pandas as pd".

Pandas simplifies many tasks related to relational data, such as:
Importing datasets - available in the form of spreadsheets, comma-separated values (CSV) files, and more.
Data validation and cleansing - detecting missing values (represented as NaN or NA) and dropping or replacing them.
Data normalization - converting the data into a format suitable for analysis.
Merging and joining datasets - datasets can be merged and joined intuitively.
Pivoting of datasets - datasets can be reshaped and pivoted as needed.
Statistical analysis - statistical operations on datasets are available.
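
Several of the tasks listed above can be sketched in a few lines. The example below is only an illustration: the DataFrames, column names, and values (sales, regions, rep, revenue) are hypothetical, and filling missing values with 0 is just one of several possible cleansing choices.

```python
import numpy as np
import pandas as pd

# Hypothetical data: names and values are made up for illustration
sales = pd.DataFrame({
    'rep': ['Ann', 'Ann', 'Bob', 'Bob'],
    'quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'revenue': [100.0, np.nan, 80.0, 120.0],
})
regions = pd.DataFrame({'rep': ['Ann', 'Bob'], 'region': ['North', 'South']})

# Cleansing: count the missing values, then replace them with 0
missing_count = int(sales['revenue'].isna().sum())
clean = sales.fillna({'revenue': 0.0})

# Merging: attach each rep's region using the shared 'rep' column
merged = clean.merge(regions, on='rep', how='left')

# Pivoting: one row per rep, one column per quarter
pivoted = merged.pivot(index='rep', columns='quarter', values='revenue')

# Statistics: mean revenue per region
mean_by_region = merged.groupby('region')['revenue'].mean()

print(pivoted)
print(mean_by_region)
```

Whether a missing value should be filled, dropped, or left as NaN depends on the analysis, so the fillna step here should be read as a placeholder for that decision.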

Generally, Pandas operates on two kinds of objects: Series and DataFrame.
A Series is a one-dimensional labeled array that can hold data of any type, such as integers, strings, and Python objects.
A DataFrame is a two-dimensional data structure that holds data in tabular form, with labeled rows and named columns.
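
To make the difference between the two objects concrete, here is a minimal sketch. The labels and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# A Series: one-dimensional values, each with its own label
ratings = pd.Series([3.5, 4.6, 3.9], index=['a', 'b', 'c'], name='Rating')

# Look up a single value by its label
rating_b = ratings['b']

# A DataFrame: two-dimensional, with named columns
table = pd.DataFrame({'Item': ['x', 'y'], 'Count': [10, 20]})

# Selecting one column of a DataFrame gives back a Series
counts = table['Count']

print(ratings)
print(type(counts))
```

A DataFrame can be thought of as a collection of Series that share the same row labels.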

DataFrame
A DataFrame is similar to a table in a database or a spreadsheet. 
Consider a table representing a sales team:

Name	Age	Gender	Rating
Steve	42	Male	3.45
Lia	38	Female	4.6
Vic	35	Male	3.9
Katlyn	38	Female	2.78


In [1]:
# create a pandas DataFrame and populate it with our table's data
import pandas as pd

# table as a dictionary
data = {
    'Name': ['Steve', 'Lia', 'Vic', 'Katlyn'],
    'Age': [42, 38, 35, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78]
}

# create the DataFrame
df = pd.DataFrame(data)

# display the DataFrame
print(df)

     Name  Age  Gender  Rating
0   Steve   42    Male    3.45
1     Lia   38  Female    4.60
2     Vic   35    Male    3.90
3  Katlyn   38  Female    2.78


In [2]:
# display a single column of the DataFrame
print(df['Name'])

0     Steve
1       Lia
2       Vic
3    Katlyn
Name: Name, dtype: object
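
Selecting a single column returns a Series. As a rough sketch of a few other common selections, the example below recreates the same team data so that it runs on its own; the variable names other than df are introduced here only for illustration.

```python
import pandas as pd

# Same team data as in the cell above, recreated so this sketch is self-contained
df = pd.DataFrame({
    'Name': ['Steve', 'Lia', 'Vic', 'Katlyn'],
    'Age': [42, 38, 35, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78],
})

# A list of column names returns a DataFrame with only those columns
name_and_rating = df[['Name', 'Rating']]

# loc selects by label, iloc selects by integer position
first_row = df.loc[0]
last_two_rows = df.iloc[-2:]

# A boolean condition keeps only the rows where it is True
high_rated = df[df['Rating'] > 3.5]

print(name_and_rating)
print(high_rated)
```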


A Pandas DataFrame has the following attributes (most are also available on a Series):
1)	dtypes: returns the data type of each column of the DataFrame (a Series has a single dtype attribute instead)
2)	index: provides the index (row labels) of the Series or DataFrame
3)	values: returns the data in the Series or DataFrame as a NumPy array
4)	shape: returns a tuple representing the dimensionality of the DataFrame (rows, columns)
5)	ndim: returns the number of dimensions of the object. Series is always 1D, and DataFrame is 2D
6)	size: gives the total number of elements in the object
7)	empty: checks if the object is empty, and returns True if it is
8)	columns: provides the column labels of the DataFrame object
  

In [6]:
# working with DataFrame attributes

import numpy as np

# create a DataFrame, populated with random numbers
df_numbers = pd.DataFrame(np.random.randn(8, 4), columns=list('ABCD'))

# display DataFrame 
print("DataFrame:\n", df_numbers)

DataFrame:
           A         B         C         D
0 -0.178706 -0.960766  0.436961  0.536331
1 -0.951409  0.401667 -0.554034  0.986464
2  0.432042 -1.842698  0.516388 -0.350617
3  0.287110  1.805995  0.332869 -0.409164
4  0.275934 -0.797893 -0.822139 -1.489373
5  2.141145 -0.367153  1.262124  0.691418
6  0.953803  0.767487 -0.212739 -0.146108
7  0.119920 -0.138867  0.869964 -0.092097


In [4]:
# attributes output
print("DataFrame Attributes:")
print("Data types:", df.dtypes)
print("Index:", df.index)
print("Columns:", df.columns)
print("Values:")
print(df.values)
print("Shape:", df.shape)
print("Number of dimensions:", df.ndim)
print("Size:", df.size)
print("Is empty:", df.empty)

DataFrame Attributes:
Data types: Name       object
Age         int64
Gender     object
Rating    float64
dtype: object
Index: RangeIndex(start=0, stop=4, step=1)
Columns: Index(['Name', 'Age', 'Gender', 'Rating'], dtype='object')
Values:
[['Steve' 42 'Male' 3.45]
 ['Lia' 38 'Female' 4.6]
 ['Vic' 35 'Male' 3.9]
 ['Katlyn' 38 'Female' 2.78]]
Shape: (4, 4)
Number of dimensions: 2
Size: 16
Is empty: False


Pandas offers several basic methods that make it easy to quickly inspect and understand the data inside a DataFrame or Series:
1) head(n) - Returns the first n rows of the object. The default value of n is 5.
2) tail(n) - Returns the last n rows of the object. The default value of n is 5.
3) info() - Provides a concise summary of a DataFrame, including the index dtype and column dtypes, non-null values, and memory usage.
4) describe() - Generates descriptive statistics of the DataFrame or Series, such as count, mean, std, min, and max.

Let's apply these methods to the two DataFrames created above.


In [7]:
# print the team DataFrame
print("Team Data:\n", df)

# basic methods
print("\nInfo of the DataFrame:")
df.info()
print("\nFirst 5 rows of the DataFrame:\n", df.head())
print("\nLast 3 rows of the DataFrame:\n", df.tail(3))
print("\nDescriptive Statistics of the DataFrame:\n", df.describe())

# print numbers DataFrame
print("\nNumbers:\n", df_numbers)

# basic methods
print("\nInfo of the DataFrame:")
df_numbers.info()
print("\nFirst 5 rows of the DataFrame:\n", df_numbers.head())
print("\nLast 3 rows of the DataFrame:\n", df_numbers.tail(3))
print("\nDescriptive Statistics of the DataFrame:\n", df_numbers.describe())


Team Data:
      Name  Age  Gender  Rating
0   Steve   42    Male    3.45
1     Lia   38  Female    4.60
2     Vic   35    Male    3.90
3  Katlyn   38  Female    2.78

Info of the DataFrame:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Name    4 non-null      object 
 1   Age     4 non-null      int64  
 2   Gender  4 non-null      object 
 3   Rating  4 non-null      float64
dtypes: float64(1), int64(1), object(2)
memory usage: 260.0+ bytes

First 5 rows of the DataFrame:
      Name  Age  Gender  Rating
0   Steve   42    Male    3.45
1     Lia   38  Female    4.60
2     Vic   35    Male    3.90
3  Katlyn   38  Female    2.78

Last 3 rows of the DataFrame:
      Name  Age  Gender  Rating
1     Lia   38  Female    4.60
2     Vic   35    Male    3.90
3  Katlyn   38  Female    2.78

Descriptive Statistics of the DataFrame:
              Age    Rating
count 

In [9]:
# sort a DataFrame by the values in a column
# create an unsorted DataFrame
unsorted_df = pd.DataFrame({'col1': [2, 9, 5, 0, 11, 42, -42, 13, 21], 'col2': [1, 3, 2, 4, 5, 6, 7, 8, 9]})
print("Unsorted DataFrame:\n", unsorted_df)

# sort the DataFrame by values from col1
sorted_df = unsorted_df.sort_values(by='col1')
print("\nSorted DataFrame:\n", sorted_df)


Unsorted DataFrame:
    col1  col2
0     2     1
1     9     3
2     5     2
3     0     4
4    11     5
5    42     6
6   -42     7
7    13     8
8    21     9

Sorted DataFrame:
    col1  col2
6   -42     7
3     0     4
0     2     1
2     5     2
1     9     3
4    11     5
7    13     8
8    21     9
5    42     6
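
Sorting keeps the original index labels attached to each row, which is why the index column above appears out of order. The following sketch shows a few related options: descending order, sorting by more than one column, and restoring or renumbering the index. The scores DataFrame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical data with repeated values in col1 so the tie-breaker matters
scores = pd.DataFrame({'col1': [2, 1, 2, 1], 'col2': [10, 40, 30, 20]})

# Descending order on a single column
by_col2_desc = scores.sort_values(by='col2', ascending=False)

# col1 ascending, with ties broken by col2 descending
by_both = scores.sort_values(by=['col1', 'col2'], ascending=[True, False])

# Restore the original row order using the index labels
restored = by_both.sort_index()

# Renumber the rows 0..n-1 after sorting
renumbered = by_both.reset_index(drop=True)

print(by_both)
print(renumbered)
```

Note that sort_values returns a new DataFrame and leaves the original unchanged unless the result is assigned back.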
