# Python | Pandas Dataframe/Series.head() method

Python is a great language for data analysis, largely because of its ecosystem of data-centric packages. ***Pandas*** is one of those packages, and it makes importing and analyzing data much easier.

The Pandas **head()** method returns the first n rows (5 by default) of a data frame or series.

In the following examples, the data frame contains data on NBA players. An image of the data frame before any operations is attached below.

![image.png](attachment:image.png)

# Example #1:
In this example, the top 5 rows of the data frame are returned and stored in a new variable. No argument is passed to the .head() method, since n defaults to 5.


In [2]:
#importing pandas as pd
import pandas as pd

#import and make the dataframe
data = pd.read_csv('nba.csv')



In [None]:
#calling the head() method
#storing in new variable
data_top = data.head()

print(data_top)

# Output:
![image.png](attachment:image.png)


# Example #2: Calling on a Series with the **n** parameter

In this example, the ***.head()*** method is called on a series with a custom value for the n parameter to return the top 9 rows of the series.

In [None]:
#import pandas
import pandas as pd

#making dataframe
data = pd.read_csv("nba.csv")

#number of rows to return
n = 9
#creating series
series = data["Name"]

#returning top n rows
top = series.head(n = n)

#display
top

# Output:
![image.png](attachment:image.png)
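
The examples above depend on `nba.csv`. As a minimal sketch that runs without that file, the cell below applies `.head()` to a small made-up data frame (the names and ages are hypothetical, not taken from the NBA data). It also shows one behavior not covered above: a negative n returns every row except the last |n| rows.

```python
import pandas as pd

# Hypothetical toy data standing in for the NBA file
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G"],
    "Age": [25, 31, 22, 28, 35, 27, 30],
})

first_five = df.head()           # default n=5
first_two = df.head(2)           # explicit n
all_but_last_two = df.head(-2)   # negative n: all rows except the last 2
name_head = df["Name"].head(3)   # same method on a Series

print(first_five)
print(first_two)
print(all_but_last_two)
print(name_head)
```

If n is larger than the number of rows, `.head()` simply returns all rows rather than raising an error.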


# Python | Pandas Dataframe/Series.tail() method


The Pandas **tail()** method returns the last n rows (5 by default) of a data frame or series.

# Example #1:
In this example, the bottom 5 rows of the data frame are returned and stored in a new variable. No argument is passed to the **.tail()** method, since n defaults to 5.



In [5]:
#import pandas 
import pandas as pd

#making data frame
data = pd.read_csv("nba.csv")

#calling tail() method
data_bottom = data.tail()

#display
data_bottom

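|     | Name | Team | Number | Position | Age | Height | Weight | College | Salary |
|-----|------|------|--------|----------|-----|--------|--------|---------|--------|
| 453 | Shelvin Mack | Utah Jazz | 8.0 | PG | 26.0 | 6-3 | 203.0 | Butler | 2433333.0 |
| 454 | Raul Neto | Utah Jazz | 25.0 | PG | 24.0 | 6-1 | 179.0 | NaN | 900000.0 |
| 455 | Tibor Pleiss | Utah Jazz | 21.0 | C | 26.0 | 7-3 | 256.0 | NaN | 2900000.0 |
| 456 | Jeff Withey | Utah Jazz | 24.0 | C | 26.0 | 7-0 | 231.0 | Kansas | 947276.0 |
| 457 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |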
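
As with `.head()`, the n parameter of `.tail()` can be set explicitly, and the method works on a series too. The following is a minimal sketch on a made-up series (the values are hypothetical); note that the returned rows keep their original index labels, just as the output above keeps labels 453 to 457.

```python
import pandas as pd

# Hypothetical scores, indexed 0..6 by default
scores = pd.Series([10, 20, 30, 40, 50, 60, 70], name="score")

last_five = scores.tail()            # default n=5
last_three = scores.tail(3)          # explicit n
all_but_first_two = scores.tail(-2)  # negative n: all rows except the first 2

print(last_five)
print(last_three)
print(all_but_first_two)
```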


# Pandas DataFrame describe() Method

# Pandas DataFrame describe()
Pandas **describe()** is used to view basic statistical details such as percentiles, mean, and std of a data frame or a series of numeric values. When this method is applied to a series of strings, it returns a different summary, which is shown in the examples below.

* **percentiles**: List-like of numbers between 0 and 1 specifying which percentiles to return. By default, the 25th, 50th, and 75th percentiles are returned.
* **include**: List of data types to include when describing the data frame. Default is None.
* **exclude**: List of data types to exclude when describing the data frame. Default is None.
* **Return type**: Statistical summary of the data frame or series.
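
To make the parameters above concrete, here is a minimal sketch on a made-up data frame (the column names and values are hypothetical). The `team` column is created with an explicit object dtype so that `include=["object"]` selects it regardless of how a given pandas version infers string columns.

```python
import pandas as pd

# Hypothetical data: one text column, two numeric columns, one missing salary
df = pd.DataFrame({
    "team": pd.Series(["X", "Y", "X", "Z"], dtype=object),
    "age": [22, 25, 31, 28],
    "salary": [1.0e6, 2.5e6, None, 4.0e6],
})

numeric_only = df.describe()                      # default: numeric columns only
custom_pct = df.describe(percentiles=[0.1, 0.9])  # request the 10th and 90th percentiles
text_only = df.describe(include=["object"])       # only object (text) columns
no_text = df.describe(exclude=["object"])         # everything except object columns

print(numeric_only)
print(custom_pct)
print(text_only)
print(no_text)
```

The `count` row for `salary` is 3 rather than 4, because missing values are not counted.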



In [6]:
#import pandas
import pandas as pd

#reading and printing csv file

data = pd.read_csv('nba.csv')
print(data.head())

            Name            Team  Number Position   Age Height  Weight  \
0  Avery Bradley  Boston Celtics     0.0       PG  25.0    6-2   180.0   
1    Jae Crowder  Boston Celtics    99.0       SF  25.0    6-6   235.0   
2   John Holland  Boston Celtics    30.0       SG  27.0    6-5   205.0   
3    R.J. Hunter  Boston Celtics    28.0       SG  22.0    6-5   185.0   
4  Jonas Jerebko  Boston Celtics     8.0       PF  29.0   6-10   231.0   

             College     Salary  
0              Texas  7730337.0  
1          Marquette  6796117.0  
2  Boston University        NaN  
3      Georgia State  1148640.0  
4                NaN  5000000.0  


# Using Describe function in Pandas
We can easily obtain several statistical measures, including the mean, median (the 50% row), standard deviation, and quartiles, by calling **describe()** on a DataFrame.

In [7]:
print(data.describe())

           Number         Age      Weight        Salary
count  457.000000  457.000000  457.000000  4.460000e+02
mean    17.678337   26.938731  221.522976  4.842684e+06
std     15.966090    4.404016   26.368343  5.229238e+06
min      0.000000   19.000000  161.000000  3.088800e+04
25%      5.000000   24.000000  200.000000  1.044792e+06
50%     13.000000   26.000000  220.000000  2.839073e+06
75%     25.000000   30.000000  240.000000  6.500000e+06
max     99.000000   40.000000  307.000000  2.500000e+07


# Explanation of the description of numerical columns:


* **count**: Number of non-null values in the column
* **mean**: Mean of the column values
* **std**: Standard deviation of the column values
* **min**: Minimum value in the column
* **25%**: 25th percentile
* **50%**: 50th percentile (the median)
* **75%**: 75th percentile
* **max**: Maximum value in the column
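
Each row listed above can be reproduced with an individual Series method. The sketch below uses a made-up series of ages (hypothetical values, including one missing entry) and compares the manual results against `describe()`.

```python
import pandas as pd

# Hypothetical ages; None becomes NaN and is ignored by every statistic
ages = pd.Series([19, 24, 26, 30, 40, None], name="Age")

summary = ages.describe()

# Compute each statistic by hand with the matching Series method
manual = {
    "count": ages.count(),          # non-null values only
    "mean": ages.mean(),
    "std": ages.std(),              # sample standard deviation (ddof=1)
    "min": ages.min(),
    "25%": ages.quantile(0.25),
    "50%": ages.median(),
    "75%": ages.quantile(0.75),
    "max": ages.max(),
}

for label, value in manual.items():
    print(label, value, summary[label])
```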

# Pandas ***describe()*** behavior for object and numeric dtypes
In this example, the data frame is described with ['object', 'float', 'int'] passed to the include parameter, so the summary covers both the object columns and the numeric columns. [.20, .40, .60, .80] is passed to the percentiles parameter to view the respective percentiles of the numeric columns.



In [8]:
import pandas as pd
data = pd.read_csv("nba.csv")

#remove null values to avoid errors
data.dropna(inplace=True)

#percentile list
perc = [.20, .40, .60, .80]

# list of dtypes to include
include = ['object', 'float', 'int']

#calling describe method
desc = data.describe(percentiles=perc, include=include)

#display
desc

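|        | Name | Team | Number | Position | Age | Height | Weight | College | Salary |
|--------|------|------|--------|----------|-----|--------|--------|---------|--------|
| count  | 364 | 364 | 364.0 | 364 | 364.0 | 364 | 364.0 | 364 | 364.0 |
| unique | 364 | 30 | NaN | 5 | NaN | 17 | NaN | 115 | NaN |
| top    | Avery Bradley | New Orleans Pelicans | NaN | SG | NaN | 6-9 | NaN | Kentucky | NaN |
| freq   | 1 | 16 | NaN | 87 | NaN | 49 | NaN | 22 | NaN |
| mean   | NaN | NaN | 16.82967 | NaN | 26.615385 | NaN | 219.785714 | NaN | 4620311.0 |
| std    | NaN | NaN | 14.994162 | NaN | 4.233591 | NaN | 24.793099 | NaN | 5119716.0 |
| min    | NaN | NaN | 0.0 | NaN | 19.0 | NaN | 161.0 | NaN | 55722.0 |
| 20%    | NaN | NaN | 4.0 | NaN | 23.0 | NaN | 195.0 | NaN | 947276.0 |
| 40%    | NaN | NaN | 9.0 | NaN | 25.0 | NaN | 212.0 | NaN | 1638754.0 |
| 50%    | NaN | NaN | 12.0 | NaN | 26.0 | NaN | 220.0 | NaN | 2515440.0 |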


# Describing series of strings 
In this example, the describe() method is called on the Name column to see its behavior with the object data type.

In [9]:
#import pandas as pd
import pandas as pd

#making data frame
data = pd.read_csv("nba.csv")

# removing all null values to avoid errors
data.dropna(inplace=True)

#calling describe method
desc = data["Name"].describe()

#display
desc

count               364
unique              364
top       Avery Bradley
freq                  1
Name: Name, dtype: object
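
Because every name in the data is unique, the `top` and `freq` rows above are not very informative. As a minimal sketch, the cell below describes a made-up series with repeated values (the positions are hypothetical), where `top` is the most frequent value and `freq` is how often it occurs.

```python
import pandas as pd

# Hypothetical positions with repeats
positions = pd.Series(["PG", "SG", "SG", "C", "PF", "SG", "C"], name="Position")

summary = positions.describe()
print(summary)

# Cross-check top/freq against value_counts(), which sorts by frequency
counts = positions.value_counts()
print(counts)
```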

In [24]:
from itertools import product

A = list(map(int, input().split()))
B = list(map(int, input().split()))

cartesian = product(A, B)

formatted_output = ' '.join(f"({x}, {y})" for x, y in cartesian)
print(formatted_output)

(1, 3) (1, 4) (2, 3) (2, 4)


In [5]:
nums = [-4,-1,0,3,10]
squared_nums = [x ** 2 for x in nums]
n = len(squared_nums)
for i in range(n-1):
    for j in range(n-i-1):
        if squared_nums[j] > squared_nums[j+1]:
            squared_nums[j], squared_nums[j+1] = squared_nums[j+1], squared_nums[j]

print("Sorted array:", squared_nums)

Sorted array: [0, 1, 9, 16, 100]


In [7]:
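def merge(nums1, m, nums2, n):
    """Merge sorted nums2 (n items) into sorted nums1 in place.

    nums1 holds m valid items followed by n placeholder slots.
    """
    # Start from the end of nums1 and nums2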
    i = m - 1  # Pointer for the last element in the initial part of nums1
    j = n - 1  # Pointer for the last element in nums2
    k = m + n - 1  # Pointer for the last position in nums1

    # While there are elements to be compared in nums1 and nums2
    while i >= 0 and j >= 0:
        if nums1[i] > nums2[j]:
            nums1[k] = nums1[i]
            i -= 1
        else:
            nums1[k] = nums2[j]
            j -= 1
        k -= 1

    # If there are remaining elements in nums2, copy them
    while j >= 0:
        nums1[k] = nums2[j]
        j -= 1
        k -= 1

# Example 1
nums1 = [1, 2, 3, 0, 0, 0]
m = 3
nums2 = [2, 5, 6]
n = 3
merge(nums1, m, nums2, n)
print("Output:", nums1)  # Output: [1, 2, 2, 3, 5, 6]

# Example 2
nums1 = [1]
m = 1
nums2 = []
n = 0
merge(nums1, m, nums2, n)
print("Output:", nums1)  # Output: [1]

# Example 3
nums1 = [0]
m = 0
nums2 = [1]
n = 1
merge(nums1, m, nums2, n)
print("Output:", nums1)  # Output: [1]


Output: [1, 2, 2, 3, 5, 6]
Output: [1]
Output: [1]
