# Argentina Neuquen O&G Project

#### INTRODUCTION

##### My project uses a database from a public website:
##### https://hidrocarburos.energianeuquen.gob.ar/portalgis/web/

##### This website serves as a channel of information between the government of the Neuquen Province in Argentina and the public. Its data are used by oil companies, market analysts, economists, and others working in the energy sector.

##### The data cover basic aspects of energy activities in the Neuquen Province of Argentina. The information is provided as geodatabases, each containing several files. The files I am using are .dbs files, which can be opened in Excel or pandas and therefore cleaned up with Python. Most of that cleanup was done previously, but some invalid rows still need to be removed and some choices simplified.

##### All the information is in Spanish but some column labels have been translated to English to facilitate their identification by English speakers.

##### My objective is to build a web app that contains information about the wells drilled in the Neuquen Province.

##### Information from these tables will be loaded into Tableau so the user can query and display it interactively.

##### The website will also contain an interactive contour map of calculated properties based on simple physics, generated with Python and matplotlib.

##### Below is the link to the tables:

### https://github.com/ortegaorlando2/Argentina.git

### Graphical Summary of the proposal

![Image from Gyazo](https://i.gyazo.com/2e400a5b639255bc10a90bcb02178058.png)

In [17]:
# Dependencies
import pandas as pd
import csv 

In [20]:
path="WellsSimpleIndexP.csv"

In [21]:
# Load the cleaned well index and drop rows with missing (NaN) values
Wells = pd.read_csv(path).dropna()
Wells

Unnamed: 0,WID,WELL_NAME,WELL_S_NAME,ELEVATION,DEPTH_m,ID_NQN,LATITUDE,LONGITUDE,F_ID,RID,Geometry_ID,OP_ID,ST_ID,FM_ID
0,0,PEL.Nq.BoMo-14,BoMo-14,544.10,2805.0,A087,-38.14176014,-68.28363303,4,0,0,17,1,22
1,1,PPC.Nq.EC-2,EC-2,536.70,2607.0,A087,-37.87344702,-68.44036744,4,0,0,17,0,22
2,2,PPC.Nq.EC-4,EC-4,555.60,2585.0,A087,-37.95364471,-68.44760234,4,1,0,17,1,23
3,3,PPC.Nq.EC-7,EC-7,527.10,2451.0,A087,-37.88423618,-68.43573887,0,0,0,17,1,19
4,4,PPC.Nq.EC-10,EC-10,537.40,2648.0,A087,-37.96340915,-68.43183347,4,0,0,17,2,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14748,14748,YPF.Nq.ChSN-615,ChSN-615,983.14,1240.0,A048,-37.30150365,-69.26024478,4,0,0,2,1,0
14749,14749,YPF.Nq.ChSN-633(d),ChSN-633(d),941.48,1275.0,A048,-37.30373585,-69.24386619,4,0,1,2,1,0
14750,14750,YPF.Nq.ChSN-702(h),ChSN-702(h),999999.00,999999.0,A048,Not reported,Not reported,8,1,2,2,10,23
14751,14751,YPF.Nq.ChSN-703(h),ChSN-703(h),999999.00,999999.0,A048,Not reported,Not reported,8,1,2,2,10,23
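
The last rows of the table above still carry `999999.00` and `Not reported` even after `dropna()`, because those placeholders are ordinary values rather than NaN. The sketch below shows one possible way to drop such rows, using only the standard library and three rows copied from the table. The helper names `parse_number` and `load_valid_wells`, and the choice of which columns to check, are assumptions made for illustration.

```python
import csv
import io

# Placeholder values seen in the table above; treated here as "missing".
MISSING_NUMBER = 999999.0
MISSING_TEXT = "Not reported"

# Three rows copied from the table above (subset of columns).
SAMPLE_CSV = """WID,WELL_NAME,ELEVATION,DEPTH_m,LATITUDE,LONGITUDE
0,PEL.Nq.BoMo-14,544.10,2805.0,-38.14176014,-68.28363303
1,PPC.Nq.EC-2,536.70,2607.0,-37.87344702,-68.44036744
14750,YPF.Nq.ChSN-702(h),999999.00,999999.0,Not reported,Not reported
"""

# Columns that must hold a real measurement (an assumption for this sketch).
NUMERIC_COLUMNS = ("ELEVATION", "DEPTH_m", "LATITUDE", "LONGITUDE")


def parse_number(text):
    """Return a float, or None when the cell holds a missing-value placeholder."""
    if text.strip() == MISSING_TEXT:
        return None
    value = float(text)
    return None if value == MISSING_NUMBER else value


def load_valid_wells(csv_text):
    """Keep only rows whose numeric columns all hold real measurements."""
    valid = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        numbers = {col: parse_number(row[col]) for col in NUMERIC_COLUMNS}
        if None not in numbers.values():
            # Replace the text cells with their parsed float values.
            valid.append({**row, **numbers})
    return valid


wells = load_valid_wells(SAMPLE_CSV)
print([w["WELL_NAME"] for w in wells])
```

In pandas, the same idea could likely be expressed by passing these placeholders through the `na_values` argument of `pd.read_csv` before calling `dropna()`.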


# Proposal Split 


### 1) Tableau to guide user insights about this hydrocarbon-rich area of the world

##### a) Based on csv(s) already cleaned in ETL project

### 2) Machine learning to predict the probability of finding oil based on the depth in the subsurface, latitude, and longitude provided by the user, and to deliver a call to action (OPTION 1)

##### a) Based on csv(s) already cleaned in ETL project

#### b) Jupyter Notebook for Machine Learning (may be more than 1)

In [None]:
import pickle

import numpy as np


class ModelHelper:
    """Wraps the pickled classifier used by the Flask app."""

    def makePredictions(self, sex_flag, age, fare, familySize, pclass, embarked):
        # One-hot encode pclass; any other value leaves all three flags at 0
        pclass_1 = int(pclass == 1)
        pclass_2 = int(pclass == 2)
        pclass_3 = int(pclass == 3)

        # One-hot encode embarked the same way
        embarked_c = int(embarked == "C")
        embarked_q = int(embarked == "Q")
        embarked_s = int(embarked == "S")

        # Column order must match the order used when the model was trained
        input_pred = [[sex_flag, age, fare, familySize, pclass_1, pclass_2, pclass_3,
                       embarked_c, embarked_q, embarked_s]]

        # Load the saved model; the context manager closes the file handle
        filename = 'finalized_model.sav'
        with open(filename, 'rb') as model_file:
            ada_load = pickle.load(model_file)

        X = np.array(input_pred)
        preds_singular = ada_load.predict(X)

        return preds_singular[0]
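
The helper above takes passenger-style inputs (`pclass`, `embarked`), while section 2 describes a prediction from depth, latitude, and longitude. The sketch below shows how those three inputs might be validated and passed to a classifier. It assumes the fitted model follows the scikit-learn convention of a `predict_proba` method whose last column is the positive ("oil found") class. The names `build_features`, `predict_oil_probability`, and the stand-in `ConstantModel` are hypothetical and exist only for illustration.

```python
def build_features(depth_m, latitude, longitude):
    """Validate the three user inputs and return a one-row feature matrix."""
    depth_m = float(depth_m)
    latitude = float(latitude)
    longitude = float(longitude)
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    # Column order is an assumption; it must match the training data.
    return [[depth_m, latitude, longitude]]


def predict_oil_probability(model, depth_m, latitude, longitude):
    """Return the probability of the positive class for one location.

    Assumes `model.predict_proba` returns one row per sample and that
    the positive class is the last column.
    """
    features = build_features(depth_m, latitude, longitude)
    return float(model.predict_proba(features)[0][-1])


class ConstantModel:
    """Hypothetical stand-in for a fitted classifier, used only for the demo."""

    def predict_proba(self, rows):
        # Always answers 25% negative, 75% positive for every input row.
        return [[0.25, 0.75] for _ in rows]


probability = predict_oil_probability(ConstantModel(), 2805.0, -38.14176014, -68.28363303)
print(probability)
```

Keeping validation separate from prediction means the Flask endpoint can reject bad input with a clear message before the model is ever loaded.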


### 3) Full-stack web application (Flask) hosted on Heroku. The application will serve customized predictions from the trained model in real time. It will also tell an interactive story using Tableau.

##### a) The Flask app will look like this

In [None]:
from flask import Flask, render_template, jsonify, send_from_directory, request
import os
from modelHelper import ModelHelper

#init app and class
app = Flask(__name__)
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
modelHelper = ModelHelper()

#endpoint
# Favicon
@app.route('/favicon.ico')
def favicon():
    return send_from_directory(os.path.join(app.root_path, 'static'),
                          'favicon.ico',mimetype='image/vnd.microsoft.icon')

# Route to render index.html template
@app.route("/")
def home():
    # Return template and data
    return render_template("index.html")

@app.route("/makePredictions", methods=["POST"])
def makePredictions():
    content = request.json["data"]

    # parse
    sex_flag = int(content["sex_flag"])
    age = float(content["age"])
    fare = float(content["fare"])
    familySize = int(content["familySize"])
    p_class = int(content["p_class"])
    embarked = content["embarked"]

    # #dummy data
    # sex_flag = 1
    # age = 25
    # fare = 25
    # familySize = 2
    # p_class = 1
    # embarked = "C"

    prediction = modelHelper.makePredictions(sex_flag, age, fare, familySize, p_class, embarked)
    print(prediction)
    return jsonify({"ok": True, "prediction": str(prediction)})

####################################
# ADD MORE ENDPOINTS
####################################

@app.after_request
def add_header(r):
    """
    Add headers to force the latest IE rendering engine or Chrome Frame,
    and to disable caching so the browser always fetches the latest page.
    """
    r.headers['X-UA-Compatible'] = 'IE=Edge,chrome=1'
    r.headers["Cache-Control"] = "no-cache, no-store, must-revalidate, public, max-age=0"
    r.headers["Pragma"] = "no-cache"
    r.headers["Expires"] = "0"
    return r

#main
if __name__ == "__main__":
    app.run(debug=True)

## Contour maps will be drawn in matplotlib like this (OPTION 2, preliminarily approved by TAs)

In [None]:
import pandas as pd
import numpy as np
data_url = 'https://raw.githubusercontent.com/alexmill/website_notebooks/master/data/data_3d_contour.csv'
contour_data = pd.read_csv(data_url)
contour_data.head()

Z = contour_data.pivot_table(index='x', columns='y', values='z').T.values

X_unique = np.sort(contour_data.x.unique())
Y_unique = np.sort(contour_data.y.unique())
X, Y = np.meshgrid(X_unique, Y_unique)

pd.DataFrame(Z).round(3)

X_unique,Y_unique

(array([0.      , 0.19897 , 0.349485, 0.5     , 0.69897 , 0.849485, 1.      ]),
 array([0.        , 0.26315789, 0.52631579, 0.63157895, 0.73684211, 0.84210526, 0.94736842, 1.        ]))

pd.DataFrame(X).round(3)
pd.DataFrame(Y).round(3)

from IPython.display import set_matplotlib_formats
%matplotlib inline
set_matplotlib_formats('svg')

import matplotlib.pyplot as plt
from matplotlib import rcParams


# Initialize plot objects
rcParams['figure.figsize'] = 5, 5 # sets plot size
fig = plt.figure()
ax = fig.add_subplot(111)

# Generate a contour plot
cp = ax.contour(X, Y, Z)

# Initialize plot objects
rcParams['figure.figsize'] = 5, 5 # sets plot size
fig = plt.figure()
ax = fig.add_subplot(111)

# Define levels in z-axis where we want lines to appear
levels = np.array([-0.4,-0.2,0,0.2,0.4])

# Generate a color mapping of the levels we've specified
import matplotlib.cm as cm # matplotlib's color map library
cpf = ax.contourf(X,Y,Z, len(levels), cmap=cm.Reds)

# Set all level lines to black
line_colors = ['black' for l in cpf.levels]

# Make plot and customize axes
cp = ax.contour(X, Y, Z, levels=levels, colors=line_colors)
ax.clabel(cp, fontsize=10, colors=line_colors)
plt.xticks([0,0.5,1])
plt.yticks([0,0.5,1])
ax.set_xlabel('X-axis')
_ = ax.set_ylabel('Y-axis')
#plt.savefig('figure.pdf') # uncomment to save vector/high-res version
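
The contour example above uses a sample dataset. For the wells, the contoured quantity would be a property calculated from simple physics. As one purely illustrative candidate, the sketch below computes hydrostatic pressure, P = ρ·g·h, from well depth. It assumes a fresh-water column and treats `DEPTH_m` as vertical depth, neither of which is confirmed by the data. The function name `hydrostatic_pressure_mpa` is hypothetical, and the three depths are taken from the first rows of the well table.

```python
WATER_DENSITY_KG_M3 = 1000.0  # assumed fresh-water column
GRAVITY_M_S2 = 9.80665        # standard gravity


def hydrostatic_pressure_mpa(depth_m, density_kg_m3=WATER_DENSITY_KG_M3):
    """Return hydrostatic pressure in MPa at the given depth (P = rho * g * h)."""
    if depth_m < 0:
        raise ValueError("depth_m must not be negative")
    pascals = density_kg_m3 * GRAVITY_M_S2 * depth_m
    return pascals / 1e6


# Depths (DEPTH_m) of the first three wells in the table.
depths = {"BoMo-14": 2805.0, "EC-2": 2607.0, "EC-4": 2585.0}
pressures = {name: hydrostatic_pressure_mpa(d) for name, d in depths.items()}

for name, pressure in pressures.items():
    print(f"{name}: {pressure:.2f} MPa")
```

Values like these, paired with each well's latitude and longitude, could then be gridded and passed as `Z` to `ax.contourf` in place of the sample data.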



#### c) Theme with Bootswatch

## JavaScript will look like this

In [None]:
$(document).ready(function() {
    console.log("Page Loaded");

    $("#filter").click(function() {
        makePredictions();
    });
});

// call Flask API endpoint
function makePredictions() {
    var sex_flag = $("#gender").val();
    var age = $("#age").val();
    var fare = $("#fare").val();
    var familySize = $("#familySize").val();
    var p_class = $("#pclass").val();
    var embarked = $("#embarked").val();

    // create the payload
    var payload = {
        "sex_flag": sex_flag,
        "age": age,
        "fare": fare,
        "familySize": familySize,
        "p_class": p_class,
        "embarked": embarked
    }

    // Perform a POST request to the query URL
    $.ajax({
        type: "POST",
        url: "/makePredictions",
        contentType: 'application/json;charset=UTF-8',
        data: JSON.stringify({ "data": payload }),
        success: function(returnedData) {
            // print it
            console.log(returnedData);

            if (returnedData["prediction"] == 1) {
                $("#output").text("You Survived!");
            } else {
                $("#output").text("You Died!");
            }
        },
        error: function(XMLHttpRequest, textStatus, errorThrown) {
            alert("Status: " + textStatus);
            alert("Error: " + errorThrown);
        }
    });

}


### 4) Essay about the geological interpretation based on the data

##### a) I am familiar with this area and its data, but I am still learning as the companies operating there adapt to new realities and change strategies

#### Full Analytical Writeup (Optional Executive Summary) + Works Cited

### 5) Presentation Slide Deck

### 6) Deployment to Heroku

#### a) Procfile will look like this

In [7]:
web: gunicorn app:app

#### b) Requirements.txt will look like this:

In [None]:
certifi==2020.12.5
click==7.1.2
Flask==1.1.2
gunicorn==20.0.4
itsdangerous==1.1.0
Jinja2==2.11.3
joblib==1.0.1
MarkupSafe==1.1.1
numpy==1.19.5
pandas==1.1.5
python-dateutil==2.8.1
pytz==2021.1
scikit-learn==0.23.2
scipy==1.5.4
six==1.15.0
threadpoolctl==2.1.0
Werkzeug==1.0.1
wincertstore==0.2


#### c) Templates will look like this:

In [None]:
index.html

### 7) Upload to GitHub