![logo.jpg](attachment:logo.jpg)

![image.png](attachment:image.png)

# Pandas
- An open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures

- Pandas is a newer package built on top of NumPy

- The name Pandas is derived from "panel data", an econometrics term for multidimensional data

- Before Pandas, Python was used mainly for data munging and preparation, and contributed very little to data analysis

- Using Pandas, we can accomplish five typical steps in the processing and analysis of data, regardless of the origin of the data:
 1. Load 
 2. Prepare 
 3. Manipulate 
 4. Model
 5. Analyze.

# Advantages 

- Fast and efficient for manipulating and analyzing data. 
- Data from different file objects can be loaded. 
- Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data 
- Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects 
- Data set merging and joining. 
- Flexible reshaping and pivoting of data sets 
- Provides time-series functionality. 
- Powerful group by functionality for performing split-apply-combine operations on data sets. 
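
As a rough illustration of the split-apply-combine idea in the last bullet, the sketch below groups a tiny table by one column and averages another. The `wells` table, its column names, and its values are invented for this example and are not part of the data used later.

```python
import pandas as pd

# Hypothetical well data, invented purely for illustration
wells = pd.DataFrame({
    "field": ["A", "A", "B", "B"],
    "oil_rate": [100.0, 150.0, 80.0, 120.0],
})

# Split the rows by field, apply mean() to each group,
# and combine the results into one Series indexed by field
mean_rate = wells.groupby("field")["oil_rate"].mean()
print(mean_rate)
```

The result is a Series with one entry per distinct value of `field`.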

In [None]:
# Getting started
import pandas as pd

Pandas provides two main data structures for manipulating data:

- Series 
- DataFrame 

# Series and DataFrame

### DataFrame:
- A 2D, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns)

- A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.

- A Pandas DataFrame consists of three principal components: the data, the rows, and the columns.

![finallpandas.png](attachment:finallpandas.png)

In [1]:
import pandas as pd

In [3]:
pd.DataFrame(['Jaiyesh','Petroleum from Scratch','Data'],columns=['Name'],index=['a','b','c'])

Unnamed: 0,Name
a,Jaiyesh
b,Petroleum from Scratch
c,Data
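
The three components listed earlier (data, rows, columns) can be inspected directly on a DataFrame. The following is a minimal sketch that rebuilds the same small DataFrame and prints each component. It uses only standard attributes (`index`, `columns`, `values`, `shape`).

```python
import pandas as pd

df = pd.DataFrame(
    ['Jaiyesh', 'Petroleum from Scratch', 'Data'],
    columns=['Name'],
    index=['a', 'b', 'c'],
)

print(df.index)    # row labels
print(df.columns)  # column labels
print(df.values)   # underlying data as a NumPy array
print(df.shape)    # (number of rows, number of columns)
```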


### Series: Each column in a DataFrame is a Series

In [4]:
df = pd.DataFrame(['Jaiyesh','Petroleum from Scratch','Data'],columns=['Name'])

In [5]:
df

Unnamed: 0,Name
0,Jaiyesh
1,Petroleum from Scratch
2,Data


In [6]:
type(df)

pandas.core.frame.DataFrame

### Selecting a single column of a pandas DataFrame returns a pandas Series. To select the column, put its label between square brackets `[]`.

In [7]:
df['Name']

0                   Jaiyesh
1    Petroleum from Scratch
2                      Data
Name: Name, dtype: object

In [8]:
type(df['Name'])

pandas.core.series.Series
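
For contrast, passing a list of labels inside the brackets returns a DataFrame rather than a Series, even when the list holds only one label. A small sketch using the same `df` (the names `as_series` and `as_frame` are just illustrative):

```python
import pandas as pd

df = pd.DataFrame(['Jaiyesh', 'Petroleum from Scratch', 'Data'], columns=['Name'])

as_series = df['Name']    # single label: pandas Series
as_frame = df[['Name']]   # list of labels: pandas DataFrame

print(type(as_series))
print(type(as_frame))
```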

In [9]:
# Creating series
porosity = pd.Series([0.2,0.3,0.4,0.12])

In [10]:
porosity

0    0.20
1    0.30
2    0.40
3    0.12
dtype: float64

In [11]:
porosity = pd.Series([0.2,0.3,0.4,0.12],name='poro')

In [12]:
porosity

0    0.20
1    0.30
2    0.40
3    0.12
Name: poro, dtype: float64

In [13]:
porosity.index = ['KG Basin','Mumbai High','Panna Mukta','Mangala']

In [14]:
porosity

KG Basin       0.20
Mumbai High    0.30
Panna Mukta    0.40
Mangala        0.12
Name: poro, dtype: float64

In [15]:
porosity = pd.Series([0.2,0.3,0.4,0.12],name='poro',index = ['KG Basin','Mumbai High','Panna Mukta','Mangala'])

In [16]:
porosity

KG Basin       0.20
Mumbai High    0.30
Panna Mukta    0.40
Mangala        0.12
Name: poro, dtype: float64
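
Once a Series has labels, values can be looked up by label, by position, or by a condition. The sketch below shows one way to do each on the same `porosity` Series. The 0.25 cutoff is an arbitrary value chosen for the example.

```python
import pandas as pd

porosity = pd.Series(
    [0.2, 0.3, 0.4, 0.12],
    name='poro',
    index=['KG Basin', 'Mumbai High', 'Panna Mukta', 'Mangala'],
)

print(porosity.loc['Mangala'])     # lookup by index label
print(porosity.iloc[0])            # lookup by integer position
print(porosity[porosity > 0.25])   # boolean mask keeps the matching labels
```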

In [17]:
pordf = pd.DataFrame(porosity)

In [18]:
pordf

Unnamed: 0,poro
KG Basin,0.2
Mumbai High,0.3
Panna Mukta,0.4
Mangala,0.12


In [19]:
porosity.index

Index(['KG Basin', 'Mumbai High', 'Panna Mukta', 'Mangala'], dtype='object')

In [20]:
porosity.values

array([0.2 , 0.3 , 0.4 , 0.12])

In [21]:
dicpor = {'KG Basin':0.5,'Mumbai High':0.6, 'Panna Mukta':0.23, 'Mangala':0.32}

In [22]:
dicpor

{'KG Basin': 0.5, 'Mumbai High': 0.6, 'Panna Mukta': 0.23, 'Mangala': 0.32}

In [23]:
por = pd.Series(dicpor)

In [24]:
por

KG Basin       0.50
Mumbai High    0.60
Panna Mukta    0.23
Mangala        0.32
dtype: float64

In [25]:
por.min()

0.23

In [26]:
por.mean()

0.41250000000000003

In [27]:
por.max()

0.6
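
Besides the values themselves, it is often useful to know which label holds the minimum or maximum. A short sketch with the same dictionary, using `idxmin`, `idxmax`, and `describe`:

```python
import pandas as pd

por = pd.Series({'KG Basin': 0.5, 'Mumbai High': 0.6, 'Panna Mukta': 0.23, 'Mangala': 0.32})

print(por.idxmin())    # label of the smallest value
print(por.idxmax())    # label of the largest value
print(por.describe())  # count, mean, std, min, quartiles, and max in one call
```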

# Pandas

### Reading CSV files


In [10]:
import pandas as pd

In [12]:
df = pd.read_csv('vpd.csv')

In [13]:
type(df)

pandas.core.frame.DataFrame

In [20]:
df

Unnamed: 0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
0,07-Apr-14,7405,15/9-F-1 C,0.0,0.000,0.000,0.000,0.000,0.00000,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,WI
1,08-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,1.00306,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
2,09-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.97901,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
3,10-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.54576,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
4,11-Apr-14,7405,15/9-F-1 C,0.0,310.376,96.876,277.278,0.000,1.21599,%,33.098,10.480,33.072,0.0,0.0,0.0,,production,OP
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15629,14-Sep-16,5769,15/9-F-5,0.0,,,,0.273,0.63609,%,0.078,0.229,0.019,0.0,0.0,0.0,,production,OP
15630,15-Sep-16,5769,15/9-F-5,0.0,,,,0.287,0.67079,%,0.085,0.229,0.006,0.0,0.0,0.0,,production,OP
15631,16-Sep-16,5769,15/9-F-5,0.0,,,,0.286,0.66439,%,0.085,0.229,0.012,0.0,0.0,0.0,,production,OP
15632,17-Sep-16,5769,15/9-F-5,0.0,,,,0.272,0.62466,%,0.075,0.228,0.026,0.0,0.0,0.0,,production,OP


In [19]:
df.tail(3)

Unnamed: 0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
15631,16-Sep-16,5769,15/9-F-5,0.0,,,,0.286,0.66439,%,0.085,0.229,0.012,0.0,0.0,0.0,,production,OP
15632,17-Sep-16,5769,15/9-F-5,0.0,,,,0.272,0.62466,%,0.075,0.228,0.026,0.0,0.0,0.0,,production,OP
15633,18-Sep-16,5769,15/9-F-5,0.0,,,,,,,,,0.0,,,,0.0,injection,WI


In [24]:
# Use the DATEPRD column as the row index
pd.read_csv('vpd.csv',index_col='DATEPRD')

DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
07-Apr-14,7405,15/9-F-1 C,0.0,0.000,0.000,0.000,0.000,0.00000,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,WI
08-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,1.00306,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
09-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.97901,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
10-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.54576,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
11-Apr-14,7405,15/9-F-1 C,0.0,310.376,96.876,277.278,0.000,1.21599,%,33.098,10.480,33.072,0.0,0.0,0.0,,production,OP
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14-Sep-16,5769,15/9-F-5,0.0,,,,0.273,0.63609,%,0.078,0.229,0.019,0.0,0.0,0.0,,production,OP
15-Sep-16,5769,15/9-F-5,0.0,,,,0.287,0.67079,%,0.085,0.229,0.006,0.0,0.0,0.0,,production,OP
16-Sep-16,5769,15/9-F-5,0.0,,,,0.286,0.66439,%,0.085,0.229,0.012,0.0,0.0,0.0,,production,OP
17-Sep-16,5769,15/9-F-5,0.0,,,,0.272,0.62466,%,0.075,0.228,0.026,0.0,0.0,0.0,,production,OP
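
With a column set as the index, rows can be looked up by that column's values. The sketch below uses a three-column, in-memory stand-in for `vpd.csv` (the `csv_text` string is an assumption made so the example runs without the file) and fetches one row by its date label.

```python
import io
import pandas as pd

# Tiny stand-in for vpd.csv (hypothetical, three columns only)
csv_text = """DATEPRD,NPD_WELL_BORE_CODE,BORE_OIL_VOL
07-Apr-14,7405,0.0
08-Apr-14,7405,0.0
09-Apr-14,7405,0.0
"""

indexed = pd.read_csv(io.StringIO(csv_text), index_col='DATEPRD')

print(indexed.index.name)        # DATEPRD
print(indexed.loc['08-Apr-14'])  # row lookup by the date label
```

Here the index still holds plain strings. Date parsing is a separate step.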


In [29]:
# header=0 uses the first row of the file as the column names
pd.read_csv('vpd.csv',header=0)

Unnamed: 0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
0,07-Apr-14,7405,15/9-F-1 C,0.0,0.000,0.000,0.000,0.000,0.00000,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,WI
1,08-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,1.00306,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
2,09-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.97901,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
3,10-Apr-14,7405,15/9-F-1 C,0.0,,,,0.000,0.54576,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
4,11-Apr-14,7405,15/9-F-1 C,0.0,310.376,96.876,277.278,0.000,1.21599,%,33.098,10.480,33.072,0.0,0.0,0.0,,production,OP
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15629,14-Sep-16,5769,15/9-F-5,0.0,,,,0.273,0.63609,%,0.078,0.229,0.019,0.0,0.0,0.0,,production,OP
15630,15-Sep-16,5769,15/9-F-5,0.0,,,,0.287,0.67079,%,0.085,0.229,0.006,0.0,0.0,0.0,,production,OP
15631,16-Sep-16,5769,15/9-F-5,0.0,,,,0.286,0.66439,%,0.085,0.229,0.012,0.0,0.0,0.0,,production,OP
15632,17-Sep-16,5769,15/9-F-5,0.0,,,,0.272,0.62466,%,0.075,0.228,0.026,0.0,0.0,0.0,,production,OP


In [35]:
# Giving your own column names
pd.read_csv('vpd.csv', names=['pfs','data','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s'])

Unnamed: 0,pfs,data,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s
0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
1,07-Apr-14,7405,15/9-F-1 C,0,0,0,0,0,0,%,0,0,0,0,0,0,,production,WI
2,08-Apr-14,7405,15/9-F-1 C,0,,,,0,1.00306,%,0,0,0,0,0,0,,production,OP
3,09-Apr-14,7405,15/9-F-1 C,0,,,,0,0.97901,%,0,0,0,0,0,0,,production,OP
4,10-Apr-14,7405,15/9-F-1 C,0,,,,0,0.54576,%,0,0,0,0,0,0,,production,OP
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15630,14-Sep-16,5769,15/9-F-5,0,,,,0.273,0.63609,%,0.078,0.229,0.019,0,0,0,,production,OP
15631,15-Sep-16,5769,15/9-F-5,0,,,,0.287,0.67079,%,0.085,0.229,0.006,0,0,0,,production,OP
15632,16-Sep-16,5769,15/9-F-5,0,,,,0.286,0.66439,%,0.085,0.229,0.012,0,0,0,,production,OP
15633,17-Sep-16,5769,15/9-F-5,0,,,,0.272,0.62466,%,0.075,0.228,0.026,0,0,0,,production,OP
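
In the output above, the file's original header line shows up as data row 0, because passing `names` alone tells pandas the file has no header. If the goal is to replace the existing header, one option is to pass `header=0` together with `names`. A minimal sketch on an in-memory stand-in (the `csv_text` string and the short column names are assumptions for illustration):

```python
import io
import pandas as pd

csv_text = """DATEPRD,NPD_WELL_BORE_CODE,WELL_TYPE
07-Apr-14,7405,WI
08-Apr-14,7405,OP
"""

# names alone: the file's header line is read as a data row
kept = pd.read_csv(io.StringIO(csv_text), names=['date', 'code', 'type'])

# names plus header=0: the file's header line is discarded and replaced
replaced = pd.read_csv(io.StringIO(csv_text), names=['date', 'code', 'type'], header=0)

print(kept.shape)      # (3, 3)
print(replaced.shape)  # (2, 3)
```

Replacing the header also lets numeric columns keep a numeric dtype, since the header text is no longer mixed in with the values.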


In [38]:
# Skipping rows
pd.read_csv('vpd.csv',skiprows=[3,5,7,8]).head(20)

Unnamed: 0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
0,07-Apr-14,7405,15/9-F-1 C,0.0,0.0,0.0,0.0,0.0,0.0,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,WI
1,08-Apr-14,7405,15/9-F-1 C,0.0,,,,0.0,1.00306,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,OP
2,10-Apr-14,7405,15/9-F-1 C,0.0,,,,0.0,0.54576,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,OP
3,12-Apr-14,7405,15/9-F-1 C,0.0,303.501,96.923,281.447,0.0,3.08702,%,22.053,8.704,22.053,0.0,0.0,0.0,,production,OP
4,15-Apr-14,7405,15/9-F-1 C,0.0,303.858,97.021,289.941,0.0,31.14186,%,13.918,8.498,12.182,0.0,0.0,0.0,,production,OP
5,16-Apr-14,7405,15/9-F-1 C,0.0,303.792,97.066,299.672,0.0,0.0,%,4.12,8.821,1.49,0.0,0.0,0.0,,production,OP
6,17-Apr-14,7405,15/9-F-1 C,0.0,304.335,96.919,282.901,0.0,41.23599,%,21.434,8.854,18.795,0.0,0.0,0.0,,production,OP
7,18-Apr-14,7405,15/9-F-1 C,0.0,304.849,96.72,273.701,0.0,0.0,%,31.148,9.64,28.503,0.0,0.0,0.0,,production,OP
8,19-Apr-14,7405,15/9-F-1 C,0.0,305.371,96.616,259.62,0.0,0.43686,%,45.752,9.639,43.157,0.0,0.0,0.0,,production,OP
9,20-Apr-14,7405,15/9-F-1 C,0.0,313.871,96.56,282.814,0.0,0.45428,%,31.056,9.62,28.484,0.0,0.0,0.0,,production,OP
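
The numbers passed to `skiprows` are 0-based line numbers in the file, where line 0 is the header. That is why `skiprows=[3,5,7,8]` above drops 09-Apr-14 rather than 10-Apr-14. The sketch below shows the same behaviour on a small in-memory file, plus the callable form. The `day`/`value` data is invented for the example.

```python
import io
import pandas as pd

csv_text = """day,value
d1,10
d2,20
d3,30
d4,40
"""

# A list skips those file lines (0 is the header line, 1 is the first data row)
by_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 3])

# A callable receives each line number and returns True for lines to skip
by_func = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i > 0 and i % 2 == 0)

print(by_list['day'].tolist())  # ['d2', 'd4']
print(by_func['day'].tolist())  # ['d1', 'd3']
```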


In [41]:
pd.read_csv('vpd.csv').info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15634 entries, 0 to 15633
Data columns (total 19 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   DATEPRD                   15634 non-null  object 
 1   NPD_WELL_BORE_CODE        15634 non-null  int64  
 2   NPD_WELL_BORE_NAME        15634 non-null  object 
 3   ON_STREAM_HRS             15349 non-null  float64
 4   AVG_DOWNHOLE_PRESSURE     8980 non-null   float64
 5   AVG_DOWNHOLE_TEMPERATURE  8980 non-null   float64
 6   AVG_DP_TUBING             8980 non-null   float64
 7   AVG_ANNULUS_PRESS         7890 non-null   float64
 8   AVG_CHOKE_SIZE_P          8919 non-null   float64
 9   AVG_CHOKE_UOM             9161 non-null   object 
 10  AVG_WHP_P                 9155 non-null   float64
 11  AVG_WHT_P                 9146 non-null   float64
 12  DP_CHOKE_SIZE             15340 non-null  float64
 13  BORE_OIL_VOL              9161 non-null   float64
 14  BORE_G

In [43]:
# Parse column 0 (DATEPRD) as dates while reading
pd.read_csv('vpd.csv', parse_dates=[0])

Unnamed: 0,DATEPRD,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
0,2014-04-07,7405,15/9-F-1 C,0.0,0.000,0.000,0.000,0.000,0.00000,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,WI
1,2014-04-08,7405,15/9-F-1 C,0.0,,,,0.000,1.00306,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
2,2014-04-09,7405,15/9-F-1 C,0.0,,,,0.000,0.97901,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
3,2014-04-10,7405,15/9-F-1 C,0.0,,,,0.000,0.54576,%,0.000,0.000,0.000,0.0,0.0,0.0,,production,OP
4,2014-04-11,7405,15/9-F-1 C,0.0,310.376,96.876,277.278,0.000,1.21599,%,33.098,10.480,33.072,0.0,0.0,0.0,,production,OP
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15629,2016-09-14,5769,15/9-F-5,0.0,,,,0.273,0.63609,%,0.078,0.229,0.019,0.0,0.0,0.0,,production,OP
15630,2016-09-15,5769,15/9-F-5,0.0,,,,0.287,0.67079,%,0.085,0.229,0.006,0.0,0.0,0.0,,production,OP
15631,2016-09-16,5769,15/9-F-5,0.0,,,,0.286,0.66439,%,0.085,0.229,0.012,0.0,0.0,0.0,,production,OP
15632,2016-09-17,5769,15/9-F-5,0.0,,,,0.272,0.62466,%,0.075,0.228,0.026,0.0,0.0,0.0,,production,OP


In [44]:
# Convert an existing column of date strings to datetime64
pd.to_datetime(df['DATEPRD'])

0       2014-04-07
1       2014-04-08
2       2014-04-09
3       2014-04-10
4       2014-04-11
           ...    
15629   2016-09-14
15630   2016-09-15
15631   2016-09-16
15632   2016-09-17
15633   2016-09-18
Name: DATEPRD, Length: 15634, dtype: datetime64[ns]
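
When the date layout is known, it can be stated explicitly with the `format` argument instead of letting pandas infer it. A sketch assuming the `DD-Mon-YY` layout seen in `DATEPRD`, using three sample strings taken from the rows shown above:

```python
import pandas as pd

dates = pd.Series(['07-Apr-14', '08-Apr-14', '18-Sep-16'], name='DATEPRD')

# %d = day of month, %b = abbreviated month name, %y = two-digit year
parsed = pd.to_datetime(dates, format='%d-%b-%y')

print(parsed.dt.year.tolist())   # [2014, 2014, 2016]
print(parsed.dt.month.tolist())  # [4, 4, 9]
```

After conversion, the `.dt` accessor exposes parts of each date such as year and month.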

In [48]:
# The file is semicolon-delimited, so pass sep=';'
pd.read_csv('s.csv', sep=';')

Unnamed: 0,id,name,email,amount,date,sent
0,1,Hopper,email-address,269,03 Sep at 21:14,False
1,2,Drake,email-address,690,03 Sep at 21:14,False
2,3,Adam,email-address,20,03 Sep at 21:14,False
3,4,Justin,email-addres,199,03 Sep at 21:14,False


In [49]:
pd.read_csv('s.csv',sep = ';').shape

(4, 6)
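
To see why `sep` matters, the sketch below reads the same semicolon-delimited text with and without it. The `text` string is a hypothetical stand-in for `s.csv`, reduced to three columns.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited text standing in for s.csv
text = """id;name;amount
1;Hopper;269
2;Drake;690
"""

wrong = pd.read_csv(io.StringIO(text))           # default sep=',' finds a single column
right = pd.read_csv(io.StringIO(text), sep=';')  # splits on semicolons

print(wrong.shape)  # (2, 1)
print(right.shape)  # (2, 3)
```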

### Reading HTML: fetching tabular data from a web page

In [100]:
import pandas as pd

In [101]:
pd.read_html('https://www.macrotrends.net/1369/crude-oil-price-history-chart?q=')

[   Crude Oil Prices - Historical Annual Data                                 \
                                         Year AverageClosing Price Year Open   
 0                                       2022               $77.60    $76.08   
 1                                       2021               $68.17    $47.62   
 2                                       2020               $39.68    $61.17   
 3                                       2019               $56.99    $46.31   
 4                                       2018               $65.23    $60.37   
 5                                       2017               $50.80    $52.36   
 6                                       2016               $43.29    $36.81   
 7                                       2015               $48.66    $52.72   
 8                                       2014               $93.17    $95.14   
 9                                       2013               $97.98    $93.14   
 10                                     

In [102]:
s = pd.read_html('https://www.macrotrends.net/1369/crude-oil-price-history-chart?q=')

In [103]:
s

[   Crude Oil Prices - Historical Annual Data                                 \
                                         Year AverageClosing Price Year Open   
 0                                       2022               $77.60    $76.08   
 1                                       2021               $68.17    $47.62   
 2                                       2020               $39.68    $61.17   
 3                                       2019               $56.99    $46.31   
 4                                       2018               $65.23    $60.37   
 5                                       2017               $50.80    $52.36   
 6                                       2016               $43.29    $36.81   
 7                                       2015               $48.66    $52.72   
 8                                       2014               $93.17    $95.14   
 9                                       2013               $97.98    $93.14   
 10                                     

In [104]:
type(s)

list

In [105]:
len(s)

3

In [109]:
df = s[0]

In [107]:
s[1]

Unnamed: 0,Link Preview,HTML Code (Click to Copy)
0,Crude Oil Prices - 70 Year Historical Chart,
1,Macrotrends,
2,Source,


In [108]:
s[2]

Unnamed: 0,Link Preview,HTML Code (Click to Copy)
0,Crude Oil Prices - 70 Year Historical Chart,
1,Macrotrends,
2,Source,


In [110]:
df

Unnamed: 0_level_0,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data,Crude Oil Prices - Historical Annual Data
Unnamed: 0_level_1,Year,AverageClosing Price,Year Open,Year High,Year Low,Year Close,Annual% Change
0,2022,$77.60,$76.08,$79.46,$76.08,$79.46,5.65%
1,2021,$68.17,$47.62,$84.65,$47.62,$75.21,55.01%
2,2020,$39.68,$61.17,$63.27,$11.26,$48.52,-20.64%
3,2019,$56.99,$46.31,$66.24,$46.31,$61.14,35.42%
4,2018,$65.23,$60.37,$77.41,$44.48,$45.15,-25.32%
5,2017,$50.80,$52.36,$60.46,$42.48,$60.46,12.48%
6,2016,$43.29,$36.81,$54.01,$26.19,$53.75,44.76%
7,2015,$48.66,$52.72,$61.36,$34.55,$37.13,-30.53%
8,2014,$93.17,$95.14,$107.95,$53.45,$53.45,-45.55%
9,2013,$97.98,$93.14,$110.62,$86.65,$98.17,6.90%


In [111]:
df.shape

(36, 7)

In [112]:
df.columns

MultiIndex([('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...),
            ('Crude Oil Prices - Historical Annual Data', ...)],
           )

In [117]:
df.columns[0][1]

'Year'

In [118]:
# Collect the second-level label of each column
name = [col[1] for col in df.columns]

In [119]:
name

['Year',
 'AverageClosing Price',
 'Year Open',
 'Year High',
 'Year Low',
 'Year Close',
 'Annual% Change']

In [120]:
df.columns = name
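
An alternative to building the list by hand is `MultiIndex.get_level_values`, which returns one level of the column labels directly. The sketch below builds a small two-level header that mimics the scraped table. The `prices` frame is constructed for illustration, with two rows copied from the output shown earlier.

```python
import pandas as pd

# Small two-level header mimicking the scraped table
columns = pd.MultiIndex.from_tuples([
    ('Crude Oil Prices - Historical Annual Data', 'Year'),
    ('Crude Oil Prices - Historical Annual Data', 'Year Open'),
])
prices = pd.DataFrame([[2022, '$76.08'], [2021, '$47.62']], columns=columns)

# Keep only the second level of each column label
prices.columns = prices.columns.get_level_values(1)

print(prices.columns.tolist())  # ['Year', 'Year Open']
```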

In [121]:
df

Unnamed: 0,Year,AverageClosing Price,Year Open,Year High,Year Low,Year Close,Annual% Change
0,2022,$77.60,$76.08,$79.46,$76.08,$79.46,5.65%
1,2021,$68.17,$47.62,$84.65,$47.62,$75.21,55.01%
2,2020,$39.68,$61.17,$63.27,$11.26,$48.52,-20.64%
3,2019,$56.99,$46.31,$66.24,$46.31,$61.14,35.42%
4,2018,$65.23,$60.37,$77.41,$44.48,$45.15,-25.32%
5,2017,$50.80,$52.36,$60.46,$42.48,$60.46,12.48%
6,2016,$43.29,$36.81,$54.01,$26.19,$53.75,44.76%
7,2015,$48.66,$52.72,$61.36,$34.55,$37.13,-30.53%
8,2014,$93.17,$95.14,$107.95,$53.45,$53.45,-45.55%
9,2013,$97.98,$93.14,$110.62,$86.65,$98.17,6.90%
