# Crude Oil Production Analysis

## Introduction

Crude oil production is a critical component of the global energy market and has significant implications for economies and industries worldwide. This project aims to analyze crude oil production data to uncover trends, patterns, and insights that can inform decision-making in the energy sector.

### Objectives
- **Data Collection**: Gather historical crude oil production data from reliable sources.
- **Data Cleaning**: Process the data to handle missing values, outliers, and inconsistencies.
- **Exploratory Data Analysis (EDA)**: Use statistical methods and visualizations to explore the data.
- **Trend Analysis**: Identify and analyze long-term trends in crude oil production.
- **Predictive Modeling**: Build models to forecast future production levels.

### Dataset
The dataset used in this project includes:
- Historical crude oil production data for the Volve field, recorded daily per wellbore.

### Tools and Technologies
- **Python**: Programming language used for data analysis and modeling.
- **Pandas**: Library for data manipulation and analysis.
- **NumPy**: Library for numerical computing.
- **Plotly**: Library for interactive data visualization (used here through `plotly.express`).
- **Scikit-learn**: Machine learning library for predictive modeling.

### Structure of the Notebook
1. **Data Collection and Cleaning**: Steps to gather and preprocess the data.
2. **Exploratory Data Analysis**: Visualizations and statistical analysis of the data.
3. **Trend Analysis**: Examination of production trends over time.
4. **Predictive Modeling**: Development and evaluation of predictive models.
5. **Conclusions and Insights**: Key findings and their implications for the industry.

By the end of this project, we aim to provide a comprehensive analysis of crude oil production trends and deliver actionable insights that help stakeholders make informed decisions.


## Importing libraries and getting started

In [6]:
import pandas as pd
import numpy as np

import plotly.express as px
from sklearn.preprocessing import StandardScaler

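# Fixed random seed so that any randomized steps are reproducible
seed = 0

# Show up to 500 rows and 500 columns when displaying DataFrames
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)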

In [5]:
data_new = pd.read_excel('data/volve-field-daily-data.xlsx')

In [7]:
data_old = pd.read_csv('data/volve_field_data.csv')
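
With two versions of the data loaded, a natural first check is whether they share the same columns. The following is a minimal sketch of such a comparison; `compare_columns` is a hypothetical helper, and the toy frames only stand in for `data_new` and `data_old` (the real columns of `data_old` are not shown in this notebook, so the names used for it are illustrative).

```python
import pandas as pd


def compare_columns(left, right):
    """Return the column names shared by two DataFrames and those unique to each."""
    left_cols, right_cols = set(left.columns), set(right.columns)
    return {
        "shared": sorted(left_cols & right_cols),
        "only_left": sorted(left_cols - right_cols),
        "only_right": sorted(right_cols - left_cols),
    }


# Empty toy frames: only the column names matter for this check.
# The column layout of old_like is an assumption made for illustration.
new_like = pd.DataFrame(columns=["DATEPRD", "WELL_BORE_CODE", "BORE_OIL_VOL"])
old_like = pd.DataFrame(columns=["DATEPRD", "BORE_OIL_VOL", "BORE_GAS_VOL"])

report = compare_columns(new_like, old_like)
for key, cols in report.items():
    print(f"{key}: {cols}")
```

In the notebook itself, the call would be `compare_columns(data_new, data_old)`.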

In [8]:
data_new.head()

Unnamed: 0,DATEPRD,WELL_BORE_CODE,NPD_WELL_BORE_CODE,NPD_WELL_BORE_NAME,NPD_FIELD_CODE,NPD_FIELD_NAME,NPD_FACILITY_CODE,NPD_FACILITY_NAME,ON_STREAM_HRS,AVG_DOWNHOLE_PRESSURE,AVG_DOWNHOLE_TEMPERATURE,AVG_DP_TUBING,AVG_ANNULUS_PRESS,AVG_CHOKE_SIZE_P,AVG_CHOKE_UOM,AVG_WHP_P,AVG_WHT_P,DP_CHOKE_SIZE,BORE_OIL_VOL,BORE_GAS_VOL,BORE_WAT_VOL,BORE_WI_VOL,FLOW_KIND,WELL_TYPE
0,2014-04-07,NO 15/9-F-1 C,7405,15/9-F-1 C,3420717,VOLVE,369304,MÆRSK INSPIRER,0.0,0.0,0.0,0.0,0.0,0.0,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,WI
1,2014-04-08,NO 15/9-F-1 C,7405,15/9-F-1 C,3420717,VOLVE,369304,MÆRSK INSPIRER,0.0,,,,0.0,1.003059,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,OP
2,2014-04-09,NO 15/9-F-1 C,7405,15/9-F-1 C,3420717,VOLVE,369304,MÆRSK INSPIRER,0.0,,,,0.0,0.979008,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,OP
3,2014-04-10,NO 15/9-F-1 C,7405,15/9-F-1 C,3420717,VOLVE,369304,MÆRSK INSPIRER,0.0,,,,0.0,0.545759,%,0.0,0.0,0.0,0.0,0.0,0.0,,production,OP
4,2014-04-11,NO 15/9-F-1 C,7405,15/9-F-1 C,3420717,VOLVE,369304,MÆRSK INSPIRER,0.0,310.37614,96.87589,277.27826,0.0,1.215987,%,33.09788,10.47992,33.07195,0.0,0.0,0.0,,production,OP
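
The preview above already shows gaps: `AVG_DOWNHOLE_PRESSURE` is empty for three of the five rows and `BORE_WI_VOL` for all of them. One way to quantify this before cleaning is a per-column missing-value summary. The sketch below rebuilds a few columns of the preview by hand so that it runs on its own; `missing_summary` is a hypothetical helper, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# A few columns from the five preview rows, re-entered by hand
preview = pd.DataFrame({
    "DATEPRD": pd.to_datetime(
        ["2014-04-07", "2014-04-08", "2014-04-09", "2014-04-10", "2014-04-11"]
    ),
    "AVG_DOWNHOLE_PRESSURE": [0.0, np.nan, np.nan, np.nan, 310.37614],
    "AVG_CHOKE_SIZE_P": [0.0, 1.003059, 0.979008, 0.545759, 1.215987],
    "BORE_WI_VOL": [np.nan] * 5,
})


def missing_summary(df):
    """Count and percentage of missing values per column, most incomplete first."""
    summary = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(1),
    })
    return summary.sort_values("n_missing", ascending=False)


summary = missing_summary(preview)
print(summary)
```

Applied to the full dataset, this would be `missing_summary(data_new)`. Note that the preview also contains zeros (for example `ON_STREAM_HRS` of 0.0), which `isna()` does not count as missing; whether those represent real shut-in periods is a separate question.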


In [25]:
data_new.nunique()

DATEPRD                     3327
WELL_BORE_CODE                 7
NPD_WELL_BORE_CODE             7
NPD_WELL_BORE_NAME             7
NPD_FIELD_CODE                 1
NPD_FIELD_NAME                 1
NPD_FACILITY_CODE              1
NPD_FACILITY_NAME              1
ON_STREAM_HRS                925
AVG_DOWNHOLE_PRESSURE       6567
AVG_DOWNHOLE_TEMPERATURE    6461
AVG_DP_TUBING               8684
AVG_ANNULUS_PRESS           6644
AVG_CHOKE_SIZE_P            6419
AVG_CHOKE_UOM                  1
AVG_WHP_P                   8829
AVG_WHT_P                   8793
DP_CHOKE_SIZE               9057
BORE_OIL_VOL                7818
BORE_GAS_VOL                8005
BORE_WAT_VOL                7361
BORE_WI_VOL                 5258
FLOW_KIND                      2
WELL_TYPE                      2
dtype: int64
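
Several columns in the output above have a single unique value (`NPD_FIELD_CODE`, `NPD_FIELD_NAME`, `NPD_FACILITY_CODE`, `NPD_FACILITY_NAME`, `AVG_CHOKE_UOM`), so they cannot distinguish one row from another. A possible cleaning step, sketched below under the assumption that such columns are not needed later, is to drop them. `drop_constant_columns` is a hypothetical helper, and the toy frame uses illustrative values rather than real records.

```python
import pandas as pd

# Toy frame standing in for data_new; the volume values are illustrative only
toy = pd.DataFrame({
    "DATEPRD": pd.to_datetime(["2014-04-07", "2014-04-08", "2014-04-09"]),
    "NPD_FIELD_NAME": ["VOLVE", "VOLVE", "VOLVE"],
    "AVG_CHOKE_UOM": ["%", "%", "%"],
    "BORE_OIL_VOL": [0.0, 12.5, 30.1],
})


def drop_constant_columns(df):
    """Drop columns with at most one distinct non-null value.

    Returns the reduced DataFrame and the list of dropped column names.
    """
    counts = df.nunique()  # NaN is excluded from the count by default
    constant = counts[counts <= 1].index.tolist()
    return df.drop(columns=constant), constant


reduced, dropped = drop_constant_columns(toy)
print("Dropped:", dropped)
print("Remaining:", reduced.columns.tolist())
```

Because `nunique()` ignores missing values by default, a column that is entirely empty is also treated as constant and dropped.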