# Pandas

- Pandas is an open-source Python library designed for data manipulation and analysis.
- It provides high-performance, easy-to-use data structures and data analysis tools that make working with structured data (such as tables) intuitive and efficient.
- It is one of the most widely used libraries in data analysis and data science.
- Its capabilities extend well beyond basic operations, which makes it a robust tool for handling and processing data. The sections below take a closer look at Pandas and its functionality.



### Key Features

- Data Cleaning: Handling missing data, removing duplicates, and correcting inconsistencies.
- Data Selection: Extracting rows and columns easily.
- Data Transformation: Applying functions, filtering, and grouping data.
- Data Aggregation: Calculating sums, averages, counts, and other statistics.
- File I/O: Reading and writing data to/from formats like CSV, Excel, SQL, and JSON.



### Why Use Pandas?

- It simplifies complex data operations and reduces the amount of code needed.
- It's highly compatible with other libraries, such as NumPy, Matplotlib, and scikit-learn, for tasks like numerical computation, visualization, and machine learning.

### Installation of pandas

- Run ! pip install pandas in a notebook cell to install pandas. In our Jupyter notebook it is already installed.

In [15]:
! pip install pandas



### Importing pandas

- Import pandas with import pandas as pd; pd is the conventional short alias.

In [9]:
import pandas as pd

### Loading a CSV file

- Loading a CSV file in Pandas is straightforward: the read_csv() function reads a CSV file into a DataFrame for analysis.
- Pass the file path to read_csv(). It can also read a compressed archive such as the .zip used below, as long as the archive holds a single CSV file.
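
As a minimal sketch of read_csv(), the example below reads a small CSV from an in-memory string instead of a file. The sample rows and the names `csv_text`, `raw`, and `sample` are made up for illustration. It also assumes that a column named `Unnamed: 0` comes from a row index that was saved with the file, in which case `index_col=0` tells pandas to use it as the index.

```python
import io

import pandas as pd

# Hypothetical sample data; the empty first header cell mimics a saved index.
csv_text = """,Hospital,State,Pincode
0,A A Hospital,Tamilnadu,635110
1,A C Hospital,Tamilnadu,600023
"""

# Default read: the unnamed first column becomes a regular column "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# index_col=0 uses that first column as the row index instead.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(sample.columns))
```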

In [11]:
df = pd.read_csv(r"C:\Users\Pasha\Downloads\archive.zip")
df  # display the loaded data (Jupyter shows only the first and last rows of a long DataFrame)

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
0,0,A A Hospital,Tamilnadu,Hosur,Denkani Kottai Road Mathigiri,635110.0
1,1,A C Hospital,Tamilnadu,Chennai,"No-8, 3Rd Main Road, United India Nagar, Ayana...",600023.0
2,2,A-Care Orthopaedic & General Hospital,Maharashtra,Mumbai,"G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhaya...",401107.0
3,3,A-One Hospital Pvt Ltd,Delhi,Delhi,"A-1/7, Panchim Vihar, Rohtak Road, New Delhi-1...",110063.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
...,...,...,...,...,...,...
1343,1343,Dr. Mahesh Rathod Hospital,Gujarat,Rajkot,"1, Sardarnagar, Nr. Astran Chowk, Rajkot",360001.0
1344,1344,Dr. Malathi Manipal Hospital,Karnataka,Bangalore,"45/1, 45Th Cross, 9Th Block, Jayanagar, Bangal...",560069.0
1345,1345,Dr. Manish Patankars Nath Hospital,Maharashtra,Mumbai,"101, A-Wing, Golden Havens Apartments, Kolbad ...",400601.0
1346,1346,Dr. Manohara Sai Gowda Memorial Hospital,Karnataka,Mulbagal,"Nh-4, Opposite Inspection Bungalow, Mulbagal",563131.0


### About head()
- The head() method returns the first rows of a DataFrame or Series for quick inspection.
- By default it shows the first 5 rows; pass the n parameter to change the number.
- It is frequently used in data exploration and debugging.

In [13]:
# View the first 10 rows with head()
df.head(10)

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
0,0,A A Hospital,Tamilnadu,Hosur,Denkani Kottai Road Mathigiri,635110.0
1,1,A C Hospital,Tamilnadu,Chennai,"No-8, 3Rd Main Road, United India Nagar, Ayana...",600023.0
2,2,A-Care Orthopaedic & General Hospital,Maharashtra,Mumbai,"G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhaya...",401107.0
3,3,A-One Hospital Pvt Ltd,Delhi,Delhi,"A-1/7, Panchim Vihar, Rohtak Road, New Delhi-1...",110063.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
5,5,A.C. Hospital,Tamilnadu,Salem,"201,2Nd Agraharam",636001.0
6,6,A.G. Eye Care Hospitals (U Nit Of Dr. A.G. Eye...,Tamilnadu,Chennai,"No. 106, R.K. Mutt Road. Mylapore, Chennai, Ta...",600004.0
7,7,A.G. Hospital,Tamilnadu,Coimbatore,"34, Kpn Colony 3Rd Street, Tirupur",641601.0
8,8,A.G. Padmavati'S Hospital Ltd.,Tamilnadu,Pondicherry,"R.S. No.127/A, Villianur Main Road, Arumpartha...",605110.0
9,9,A.J. Hospital,Kerala,Trivandrum,"Kazhakuttam, P.O., Thiruvananthapuram, Trivand...",695582.0


### About tail()
- In Pandas, the tail() method is used to view the last few rows of a DataFrame or Series.
- It's especially helpful when you want to inspect the end of your dataset, such as checking trailing data or verifying the structure of the last rows.
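
A small sketch of tail() next to head(), on a made-up ten-row frame (the name `sample` is only illustrative). Both methods take an optional row count and default to 5 rows.

```python
import pandas as pd

# Hypothetical frame with ten rows, labelled 0 to 9.
sample = pd.DataFrame({"value": range(10)})

first_three = sample.head(3)   # rows 0, 1, 2
last_three = sample.tail(3)    # rows 7, 8, 9
default_tail = sample.tail()   # no argument: the last 5 rows

print(first_three.index.tolist())
print(last_three.index.tolist())
print(len(default_tail))
```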

In [15]:
# View the last 10 rows with tail()
df.tail(10)

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
1338,1338,Dr. Kulkarni Hospital,Maharashtra,Pune,"10A 11 Kalyan Peth,Junnar",410502.0
1339,1339,Dr. Kumaraswami Health Centre ( A Unit Of Indi...,Tamilnadu,Kanyakumari,"N.H. Road # 47, 1/4C,1/4D,1/4E Perumalpuramper...",629703.0
1340,1340,Dr. Kunhalu'S Nursing Home,Kerala,Ernakulam,"T.D.Road, Ernakulamernakulam",682011.0
1341,1341,Dr. Loganathan Orthopaedic Hospital,Tamilnadu,Erode,"606/A, Mettur Main Road, Bhavani,",638301.0
1342,1342,Dr. M.L.Gupta Memorial Centre,Haryana,Faridabad,"B.P. 5E/4, Railway Road, Nit,Nit - Faridabad,",121001.0
1343,1343,Dr. Mahesh Rathod Hospital,Gujarat,Rajkot,"1, Sardarnagar, Nr. Astran Chowk, Rajkot",360001.0
1344,1344,Dr. Malathi Manipal Hospital,Karnataka,Bangalore,"45/1, 45Th Cross, 9Th Block, Jayanagar, Bangal...",560069.0
1345,1345,Dr. Manish Patankars Nath Hospital,Maharashtra,Mumbai,"101, A-Wing, Golden Havens Apartments, Kolbad ...",400601.0
1346,1346,Dr. Manohara Sai Gowda Memorial Hospital,Karnataka,Mulbagal,"Nh-4, Opposite Inspection Bungalow, Mulbagal",563131.0
1347,1347,Dr. Mehta'S Multispeciality Hospital Pvt Ltd.,Tamilnadu,Chennai,"2 (E), Mc. Nicholas Road, Chennai, Opposite To...",600031.0


### About describe()
- The describe() method generates summary statistics for a DataFrame or Series: count, mean, standard deviation, minimum, maximum, and the 25th, 50th, and 75th percentiles.
- By default it leaves out non-numeric (categorical or string) columns when the DataFrame also has numeric columns.
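
If the text columns should be summarized as well, describe() accepts an include argument. The sketch below uses a made-up three-row frame; `sample`, `numeric_summary`, and `all_summary` are illustrative names. With include="all", text columns get the rows unique, top, and freq.

```python
import pandas as pd

# Hypothetical sample data with one text column and one numeric column.
sample = pd.DataFrame({
    "City": ["Chennai", "Mumbai", "Chennai"],
    "Pincode": [600023.0, 401107.0, 600004.0],
})

numeric_summary = sample.describe()            # numeric columns only
all_summary = sample.describe(include="all")   # text and numeric columns

print(numeric_summary.columns.tolist())
print(all_summary.loc[["unique", "top", "freq"], "City"])
```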



In [17]:
# describe() summarizes only the numeric columns by default
df.describe()

Unnamed: 0.1,Unnamed: 0,Pincode
count,1348.0,1347.0
mean,673.5,456418.442465
std,389.278392,174831.912834
min,0.0,110002.0
25%,336.75,380061.0
50%,673.5,500001.0
75%,1010.25,584115.0
max,1347.0,834009.0


## About info()
### Index Range
- The range of indices in the DataFrame (e.g., RangeIndex: 5 entries, 0 to 4).
### Column Summary
- Lists each column by name.
- Shows the number of non-null values in each column (Non-Null Count).
- Displays the data type of each column (Dtype).
### Data Types
- Common types include int64, float64, object (strings), datetime64, and bool.
### Memory Usage
- Displays the approximate memory consumption of the DataFrame.
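
The parts of the report listed above can be seen on a tiny made-up frame. This sketch assumes we want the report as text, so it passes a buffer through the buf parameter of info(); the names `sample`, `buffer`, and `report` are illustrative.

```python
import io

import pandas as pd

# Hypothetical sample data with one missing Pincode.
sample = pd.DataFrame({
    "Hospital": ["A A Hospital", "A C Hospital", "Deep Nursing Home"],
    "Pincode": [635110.0, 600023.0, None],
})

# info() prints to the screen by default; buf= sends the report to a text buffer.
buffer = io.StringIO()
sample.info(buf=buffer)
report = buffer.getvalue()
print(report)
```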


In [19]:
# info() shows the column data types, non-null counts, and memory usage of the DataFrame
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1348 entries, 0 to 1347
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Unnamed: 0    1348 non-null   int64  
 1   Hospital      1348 non-null   object 
 2   State         1348 non-null   object 
 3   City          1348 non-null   object 
 4   LocalAddress  1348 non-null   object 
 5   Pincode       1347 non-null   float64
dtypes: float64(1), int64(1), object(4)
memory usage: 63.3+ KB


In [21]:
print(df.shape)     # (rows, columns)
print(df.shape[0])  # number of rows
print(df.shape[1])  # number of columns

(1348, 6)
1348
6


### About columns
- Columns represent the vertical axis of a DataFrame, and each column corresponds to a single variable or feature in the dataset.
- Columns can store different types of data, such as numeric, textual, datetime, or boolean.
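
To see which type each column holds, the dtypes attribute and select_dtypes() can help. A minimal sketch on made-up data; the `Open24h` column is hypothetical and only there to show a boolean type.

```python
import pandas as pd

sample = pd.DataFrame({
    "Hospital": ["A A Hospital", "A C Hospital"],
    "Pincode": [635110.0, 600023.0],
    "Open24h": [True, False],   # hypothetical boolean column
})

print(sample.dtypes)  # one data type per column

# Pick column names by type.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
bool_cols = sample.select_dtypes(include="bool").columns.tolist()
print(numeric_cols, bool_cols)
```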




In [23]:
# columns returns only the column names
df.columns

Index(['Unnamed: 0', 'Hospital', 'State', 'City', 'LocalAddress', 'Pincode'], dtype='object')

In [25]:
list(df.columns)

['Unnamed: 0', 'Hospital', 'State', 'City', 'LocalAddress', 'Pincode']

### About nlargest()
- The nlargest() method returns the n rows with the largest values in a given column of a DataFrame, or the n largest values of a Series.
- This method is particularly useful when you want to quickly identify the top n records according to a certain variable (e.g., top 5 highest sales, top 3 highest scores, etc.).
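
A short sketch of nlargest() and its counterpart nsmallest() on a made-up four-row frame (the name `sample` is illustrative). The first argument is the number of rows and the second is the column to rank by.

```python
import pandas as pd

sample = pd.DataFrame({
    "City": ["Ranchi", "Patna", "Delhi", "Mumbai"],
    "Pincode": [834009.0, 800016.0, 110063.0, 401107.0],
})

top_two = sample.nlargest(2, "Pincode")      # two highest Pincode values
bottom_one = sample.nsmallest(1, "Pincode")  # single lowest Pincode value

print(top_two)
print(bottom_one)
```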

In [27]:
df.nlargest(5,'Pincode')

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
195,195,Alam Hospital & Research Centre,Jharkhand,Ranchi,"Booty Road, Bariatu.",834009.0
469,469,Asarfi Hospital Pvt Ltd,Jharkhand,Dhanbad,"At Baramuri Po,Bishanpur,Dhanbad",828130.0
72,72,Abhinandan Hospital,Bihar,Patna,"0/60, Doctors Colony, Kankar Baga",800020.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
464,464,Arvind Hospital Pvt Ltd,Bihar,Patna,"Ashok Rajpath,Opp.Pmch,",800004.0


### About isnull()
- The isnull() method detects missing (null) values in a DataFrame or Series.
- It returns a boolean DataFrame or Series of the same shape, where True indicates that the corresponding value is null (i.e., NaN, None, or missing), and False indicates that the value is not null.

In [29]:
# isnull is used to find missing values
df.isnull()

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
0,False,False,False,False,False,False
1,False,False,False,False,False,False
2,False,False,False,False,False,False
3,False,False,False,False,False,False
4,False,False,False,False,False,False
...,...,...,...,...,...,...
1343,False,False,False,False,False,False
1344,False,False,False,False,False,False
1345,False,False,False,False,False,False
1346,False,False,False,False,False,False


### About isnull().any()
- isnull().any() checks each column and reports whether it contains at least one null value.
- A column shows True if a null value is present and False otherwise.
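
The difference between checking by column and checking by row can be sketched on a made-up frame (all names and values here are illustrative). Both None and NaN count as null.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "City": ["Barnala", None, "Salem"],
    "Pincode": [np.nan, 600023.0, 636001.0],
    "State": ["Punjab", "Tamilnadu", "Tamilnadu"],
})

mask = sample.isnull()               # same shape as sample, True where a value is missing
columns_with_nulls = mask.any()      # one True/False per column
rows_with_nulls = mask.any(axis=1)   # one True/False per row

print(columns_with_nulls)
print(sample[rows_with_nulls])       # only the rows that have a missing value
```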

In [31]:
# isnull().any() shows which columns contain null values
df.isnull().any()

Unnamed: 0      False
Hospital        False
State           False
City            False
LocalAddress    False
Pincode          True
dtype: bool

In [33]:
# count the null values in each column
df.isnull().sum()

Unnamed: 0      0
Hospital        0
State           0
City            0
LocalAddress    0
Pincode         1
dtype: int64

In [35]:
# show the rows that contain at least one null value
df[df.isnull().any(axis=1)]

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
1139,1139,Deep Nursing Home,Punjab,Barnala,Collage Road Barnala-148101,


In [37]:
# select a single column with [] (returns a Series)
df['Pincode'].head(5)

0    635110.0
1    600023.0
2    401107.0
3    110063.0
4    800016.0
Name: Pincode, dtype: float64

In [170]:
# select two or more columns with [[ ]] (returns a DataFrame)
df[['Pincode','City']].head(10)

Unnamed: 0,Pincode,City
0,635110.0,Hosur
1,600023.0,Chennai
2,401107.0,Mumbai
3,110063.0,Delhi
4,800016.0,Patna
5,636001.0,Salem
6,600004.0,Chennai
7,641601.0,Coimbatore
8,605110.0,Pondicherry
9,695582.0,Trivandrum


In [39]:
# select a row by its index label with a boolean mask inside []
df[df.index==5]

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
5,5,A.C. Hospital,Tamilnadu,Salem,"201,2Nd Agraharam",636001.0


In [41]:
# select a row by its index label with .loc
df.loc[5]

Unnamed: 0                      5
Hospital            A.C. Hospital
State                   Tamilnadu
City                        Salem
LocalAddress    201,2Nd Agraharam
Pincode                  636001.0
Name: 5, dtype: object
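
Note that .loc selects by index label, not by position; with a default index that starts at 0 the two happen to coincide. A minimal sketch with made-up non-default labels shows the difference from .iloc, which selects by position.

```python
import pandas as pd

sample = pd.DataFrame(
    {"City": ["Hosur", "Chennai", "Mumbai"]},
    index=[10, 20, 30],   # hypothetical labels, chosen so they differ from positions
)

by_label = sample.loc[20]      # the row whose index label is 20
by_position = sample.iloc[0]   # the first row, whatever its label is

print(by_label["City"], by_position["City"])
```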

In [43]:
# select several rows with .isin(); range(1, 10) covers labels 1 to 9
df[df.index.isin(range(1,10))]

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
1,1,A C Hospital,Tamilnadu,Chennai,"No-8, 3Rd Main Road, United India Nagar, Ayana...",600023.0
2,2,A-Care Orthopaedic & General Hospital,Maharashtra,Mumbai,"G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhaya...",401107.0
3,3,A-One Hospital Pvt Ltd,Delhi,Delhi,"A-1/7, Panchim Vihar, Rohtak Road, New Delhi-1...",110063.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
5,5,A.C. Hospital,Tamilnadu,Salem,"201,2Nd Agraharam",636001.0
6,6,A.G. Eye Care Hospitals (U Nit Of Dr. A.G. Eye...,Tamilnadu,Chennai,"No. 106, R.K. Mutt Road. Mylapore, Chennai, Ta...",600004.0
7,7,A.G. Hospital,Tamilnadu,Coimbatore,"34, Kpn Colony 3Rd Street, Tirupur",641601.0
8,8,A.G. Padmavati'S Hospital Ltd.,Tamilnadu,Pondicherry,"R.S. No.127/A, Villianur Main Road, Arumpartha...",605110.0
9,9,A.J. Hospital,Kerala,Trivandrum,"Kazhakuttam, P.O., Thiruvananthapuram, Trivand...",695582.0


In [45]:
# convert the DataFrame to a NumPy array with to_numpy()
df.to_numpy()

array([[0, 'A A Hospital', 'Tamilnadu', 'Hosur',
        'Denkani Kottai Road Mathigiri', 635110.0],
       [1, 'A C Hospital', 'Tamilnadu', 'Chennai',
        'No-8, 3Rd Main Road, United India Nagar, Ayanavaram, Chennai-600023',
        600023.0],
       [2, 'A-Care Orthopaedic & General Hospital', 'Maharashtra',
        'Mumbai',
        'G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhayander Road, Mira Road.',
        401107.0],
       ...,
       [1345, 'Dr. Manish Patankars Nath Hospital', 'Maharashtra',
        'Mumbai',
        '101, A-Wing, Golden Havens Apartments, Kolbad Road, Khopat, Thana West-400601',
        400601.0],
       [1346, 'Dr. Manohara Sai Gowda Memorial Hospital', 'Karnataka',
        'Mulbagal', 'Nh-4, Opposite Inspection Bungalow, Mulbagal',
        563131.0],
       [1347, "Dr. Mehta'S Multispeciality Hospital Pvt Ltd.",
        'Tamilnadu', 'Chennai',
        '2 (E), Mc. Nicholas Road, Chennai, Opposite To Chetpet Police Station.',
        600031.0]], dtype=o

In [59]:
data = df          # data is a second name for the same DataFrame, not a copy
a = data.copy()    # a is an independent copy of the data
print(a)

      Unnamed: 0                                       Hospital        State  \
0              0                                   A A Hospital    Tamilnadu   
1              1                                   A C Hospital    Tamilnadu   
2              2          A-Care Orthopaedic & General Hospital  Maharashtra   
3              3                         A-One Hospital Pvt Ltd        Delhi   
4              4                              A.B.Eye Institute        Bihar   
...          ...                                            ...          ...   
1343        1343                     Dr. Mahesh Rathod Hospital      Gujarat   
1344        1344                   Dr. Malathi Manipal Hospital    Karnataka   
1345        1345             Dr. Manish Patankars Nath Hospital  Maharashtra   
1346        1346       Dr. Manohara Sai Gowda Memorial Hospital    Karnataka   
1347        1347  Dr. Mehta'S Multispeciality Hospital Pvt Ltd.    Tamilnadu   

           City                        
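
Plain assignment and copy() behave differently, which a small sketch on made-up values can show. The names `original`, `alias`, and `independent` are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"Pincode": [635110.0, 600023.0]})

alias = original               # a second name for the same object
independent = original.copy()  # a separate copy of the data

independent.loc[0, "Pincode"] = 0.0   # does not touch original
alias.loc[1, "Pincode"] = 1.0         # changes original as well

print(original)
```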

## Techniques for dealing with missing data
### Dropping missing values


In [61]:
a = data.dropna()  # remove every row that contains at least one null value
print(a)

      Unnamed: 0                                       Hospital        State  \
0              0                                   A A Hospital    Tamilnadu   
1              1                                   A C Hospital    Tamilnadu   
2              2          A-Care Orthopaedic & General Hospital  Maharashtra   
3              3                         A-One Hospital Pvt Ltd        Delhi   
4              4                              A.B.Eye Institute        Bihar   
...          ...                                            ...          ...   
1343        1343                     Dr. Mahesh Rathod Hospital      Gujarat   
1344        1344                   Dr. Malathi Manipal Hospital    Karnataka   
1345        1345             Dr. Manish Patankars Nath Hospital  Maharashtra   
1346        1346       Dr. Manohara Sai Gowda Memorial Hospital    Karnataka   
1347        1347  Dr. Mehta'S Multispeciality Hospital Pvt Ltd.    Tamilnadu   

           City                        

In [71]:
a.shape

(1347, 6)

In [73]:
a.isnull().sum()  # count the null values per column after dropna()

Unnamed: 0      0
Hospital        0
State           0
City            0
LocalAddress    0
Pincode         0
dtype: int64

In [75]:
a.dropna(inplace=True, axis=0)  # axis=0 drops rows; inplace=True modifies a directly
print(a.isnull().sum())

Unnamed: 0      0
Hospital        0
State           0
City            0
LocalAddress    0
Pincode         0
dtype: int64


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  a.dropna(inplace=True,axis=0)
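
One common way to avoid a warning like the one above is to assign the result of dropna() to a name instead of passing inplace=True. The sketch below uses made-up rows and illustrative names; the subset argument limits the null check to the listed columns.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Hospital": ["A A Hospital", "Deep Nursing Home", "A C Hospital"],
    "Pincode": [635110.0, np.nan, 600023.0],
})

# dropna() returns a new DataFrame; sample itself keeps all three rows.
cleaned = sample.dropna(subset=["Pincode"])

# Optional: renumber the rows 0, 1, ... after the drop.
cleaned = cleaned.reset_index(drop=True)

print(cleaned)
```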


### Replacing missing values

In [77]:
mean_value = a['Pincode'].mean()  # mean of the Pincode column
print(mean_value)

456418.44246473646


In [79]:
b = df.fillna(mean_value)  # returns a new DataFrame; df itself is not modified
print(b)

      Unnamed: 0                                       Hospital        State  \
0              0                                   A A Hospital    Tamilnadu   
1              1                                   A C Hospital    Tamilnadu   
2              2          A-Care Orthopaedic & General Hospital  Maharashtra   
3              3                         A-One Hospital Pvt Ltd        Delhi   
4              4                              A.B.Eye Institute        Bihar   
...          ...                                            ...          ...   
1343        1343                     Dr. Mahesh Rathod Hospital      Gujarat   
1344        1344                   Dr. Malathi Manipal Hospital    Karnataka   
1345        1345             Dr. Manish Patankars Nath Hospital  Maharashtra   
1346        1346       Dr. Manohara Sai Gowda Memorial Hospital    Karnataka   
1347        1347  Dr. Mehta'S Multispeciality Hospital Pvt Ltd.    Tamilnadu   

           City                        

In [81]:
b.shape

(1348, 6)

In [83]:
df[df.index==1139]  # df still has the null value, because the filled data is stored in b

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
1139,1139,Deep Nursing Home,Punjab,Barnala,Collage Road Barnala-148101,


In [217]:
median_value = data['Pincode'].median()
print(median_value)

500001.0


In [219]:
data.fillna(median_value)  # the result is displayed but not stored; data is unchanged

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
0,0,A A Hospital,Tamilnadu,Hosur,Denkani Kottai Road Mathigiri,635110.0
1,1,A C Hospital,Tamilnadu,Chennai,"No-8, 3Rd Main Road, United India Nagar, Ayana...",600023.0
2,2,A-Care Orthopaedic & General Hospital,Maharashtra,Mumbai,"G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhaya...",401107.0
3,3,A-One Hospital Pvt Ltd,Delhi,Delhi,"A-1/7, Panchim Vihar, Rohtak Road, New Delhi-1...",110063.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
...,...,...,...,...,...,...
1343,1343,Dr. Mahesh Rathod Hospital,Gujarat,Rajkot,"1, Sardarnagar, Nr. Astran Chowk, Rajkot",360001.0
1344,1344,Dr. Malathi Manipal Hospital,Karnataka,Bangalore,"45/1, 45Th Cross, 9Th Block, Jayanagar, Bangal...",560069.0
1345,1345,Dr. Manish Patankars Nath Hospital,Maharashtra,Mumbai,"101, A-Wing, Golden Havens Apartments, Kolbad ...",400601.0
1346,1346,Dr. Manohara Sai Gowda Memorial Hospital,Karnataka,Mulbagal,"Nh-4, Opposite Inspection Bungalow, Mulbagal",563131.0


In [85]:
data.shape

(1348, 6)
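
Because fillna() returns a new DataFrame, the original keeps its null values unless the result is stored. A minimal sketch on made-up rows: it fills only the Pincode column by passing a dictionary, and keeps the result under the illustrative name `filled`.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "City": ["Hosur", "Barnala", "Chennai"],
    "Pincode": [635110.0, np.nan, 600023.0],
})

# median() skips the missing value: (635110 + 600023) / 2 = 617566.5
median_value = sample["Pincode"].median()

# A dict maps column names to fill values, so other columns are left alone.
filled = sample.fillna({"Pincode": median_value})

print(filled)
```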

## Duplicate data
- drop_duplicates() removes duplicate rows from the data.
- Use it with care, because removing rows can lose data.
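
Before dropping anything, duplicated() can show which rows would be removed. A small sketch on made-up rows, with illustrative names:

```python
import pandas as pd

sample = pd.DataFrame({
    "Hospital": ["Balaji Hospital", "Balaji Hospital", "A C Hospital"],
    "City": ["Nellore", "Nellore", "Chennai"],
})

dup_mask = sample.duplicated()      # True for a row that repeats an earlier row
deduped = sample.drop_duplicates()  # keeps the first occurrence of each row

print(dup_mask.tolist())
print(deduped)
```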

In [87]:
c = data.drop_duplicates()
print(c)

      Unnamed: 0                                       Hospital        State  \
0              0                                   A A Hospital    Tamilnadu   
1              1                                   A C Hospital    Tamilnadu   
2              2          A-Care Orthopaedic & General Hospital  Maharashtra   
3              3                         A-One Hospital Pvt Ltd        Delhi   
4              4                              A.B.Eye Institute        Bihar   
...          ...                                            ...          ...   
1343        1343                     Dr. Mahesh Rathod Hospital      Gujarat   
1344        1344                   Dr. Malathi Manipal Hospital    Karnataka   
1345        1345             Dr. Manish Patankars Nath Hospital  Maharashtra   
1346        1346       Dr. Manohara Sai Gowda Memorial Hospital    Karnataka   
1347        1347  Dr. Mehta'S Multispeciality Hospital Pvt Ltd.    Tamilnadu   

           City                        

In [89]:
c.shape


(1348, 6)

## Renaming a column

In [91]:
data.rename(columns={'Pincode':'pin num'})  # returns a renamed copy; data keeps its original column names

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,pin num
0,0,A A Hospital,Tamilnadu,Hosur,Denkani Kottai Road Mathigiri,635110.0
1,1,A C Hospital,Tamilnadu,Chennai,"No-8, 3Rd Main Road, United India Nagar, Ayana...",600023.0
2,2,A-Care Orthopaedic & General Hospital,Maharashtra,Mumbai,"G-1, Giriraj Tower, Sai Baba Nagar, Mira Bhaya...",401107.0
3,3,A-One Hospital Pvt Ltd,Delhi,Delhi,"A-1/7, Panchim Vihar, Rohtak Road, New Delhi-1...",110063.0
4,4,A.B.Eye Institute,Bihar,Patna,"Road No.12, Rajendra Nagar, Patna, Bihar-800016",800016.0
...,...,...,...,...,...,...
1343,1343,Dr. Mahesh Rathod Hospital,Gujarat,Rajkot,"1, Sardarnagar, Nr. Astran Chowk, Rajkot",360001.0
1344,1344,Dr. Malathi Manipal Hospital,Karnataka,Bangalore,"45/1, 45Th Cross, 9Th Block, Jayanagar, Bangal...",560069.0
1345,1345,Dr. Manish Patankars Nath Hospital,Maharashtra,Mumbai,"101, A-Wing, Golden Havens Apartments, Kolbad ...",400601.0
1346,1346,Dr. Manohara Sai Gowda Memorial Hospital,Karnataka,Mulbagal,"Nh-4, Opposite Inspection Bungalow, Mulbagal",563131.0
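
rename() also returns a new DataFrame rather than changing the original, so the result has to be stored to keep the new name. A minimal sketch on a made-up one-row frame; `pin_num` is just an example name.

```python
import pandas as pd

sample = pd.DataFrame({"Pincode": [635110.0]})

# The dict maps old column names to new ones.
renamed = sample.rename(columns={"Pincode": "pin_num"})

print(list(sample.columns))   # original name is still there
print(list(renamed.columns))  # new name only in the returned frame
```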


### Data Analysis in pandas

In [93]:
data.mean(numeric_only=True)  # the parentheses call the method; numeric_only skips the text columns

Unnamed: 0       673.500000
Pincode       456418.442465
dtype: float64

In [95]:
data.mode()

Unnamed: 0.1,Unnamed: 0,Hospital,State,City,LocalAddress,Pincode
0,0,Balaji Hospital,Maharashtra,Mumbai,"Dargamitta,Nellore",143001.0
1,1,,,,Indapur Road,
2,2,,,,,
3,3,,,,,
4,4,,,,,
...,...,...,...,...,...,...
1343,1343,,,,,
1344,1344,,,,,
1345,1345,,,,,
1346,1346,,,,,


In [361]:
data.median(numeric_only=True)  # the parentheses call the method; numeric_only skips the text columns

Unnamed: 0       673.5
Pincode       500001.0
dtype: float64

In [124]:
# value_counts() counts how many times each value occurs in the column
data['Pincode'].value_counts()

Pincode
143001.0    9
122001.0    8
572102.0    8
522001.0    8
380007.0    7
           ..
500062.0    1
400101.0    1
441001.0    1
424101.0    1
600031.0    1
Name: count, Length: 843, dtype: int64
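
By default value_counts() leaves out missing values and sorts from most to least frequent. A short sketch on a made-up Series; dropna=False is the option that also counts the nulls.

```python
import pandas as pd

# Hypothetical values: one repeated pincode and one missing entry.
pins = pd.Series([143001.0, 143001.0, 122001.0, None], name="Pincode")

counts = pins.value_counts()                # nulls are not counted
with_nulls = pins.value_counts(dropna=False)  # nulls get their own row

print(counts)
print(with_nulls)
```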

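### Aggregating data with .groupby()

- Aggregating means combining the rows of each group into a single summary value, such as a sum, an average, or a median.

data.groupby("Pincode").median(numeric_only=True).head(10)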
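
As a minimal sketch of the same idea, the example below groups made-up rows by State and computes two aggregates of Pincode at once with agg(). The data and the name `summary` are illustrative only.

```python
import pandas as pd

sample = pd.DataFrame({
    "State": ["Tamilnadu", "Tamilnadu", "Bihar"],
    "Pincode": [635110.0, 600023.0, 800016.0],
})

# One row per State, one column per aggregate.
summary = sample.groupby("State")["Pincode"].agg(["count", "mean"])

print(summary)
```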
