# Argentina Neuquen O&G Project

#### INTRODUCTION

##### My project uses a database from a public website:
##### https://hidrocarburos.energianeuquen.gob.ar/portalgis/web/

##### This website serves as an information channel between the government of the Neuquen Province in Argentina and the public. Its users include oil companies, market analysts, economists and others working in the energy sector.

##### The data covers basic aspects of energy activities in the Neuquen Province of Argentina. The information is provided as geodatabases, each containing various files. The files I am using are .dbs files, which can be opened in Excel or loaded with pandas and then cleaned up in Python. Most of that work was done previously, but some invalid rows still need to be eliminated and some choices simplified.

##### All the information is in Spanish but some column labels have been translated to English to facilitate their identification by English speakers.

##### My objective is to build a web app that contains information about the wells drilled in the Neuquen Province.

##### Information from these tables will be loaded into Tableau so that the user can query and display it interactively.

##### The website will also contain an interactive contour map of properties calculated from simple physics, rendered with Python's matplotlib.

##### Below are examples of the tables:

![Image from Gyazo](https://i.gyazo.com/2e400a5b639255bc10a90bcb02178058.png)

In [5]:
# Dependencies
import pandas as pd
import csv 

In [6]:
path="../ContractSimple.csv"

In [10]:
# read_csv already returns a DataFrame; drop any rows with missing values
Areas = pd.read_csv(path).dropna()
Areas

| Unnamed: 0 | CONTRACT_TYPE | INDEX2 |
|---|---|---|
| 0 | CONCESION DE EXPLOTACION NO CONVENCIONAL | 3.0 |
| 1 | CONCESION DE EXPLOTACION | 1.0 |
| 2 | PERMISO DE EXPLORACION NO CONVENCIONAL | 7.0 |
| 3 | SIN CONTRATO | 10.0 |
| 4 | REVERSION | 9.0 |
| 5 | LOTE DE EXPLOTACION | 5.0 |
| 6 | LOTE BAJO EVALUACION | 4.0 |
| 7 | CONCESION DE EXPLOTACION MINERA | 2.0 |
| 8 | PRESTACION DE SERVICIO | 8.0 |
| 9 | PERMISO DE EXPLORACION | 6.0 |
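
The cleanup described in the introduction (dropping invalid rows and simplifying choices) can be sketched without any third-party dependency. The snippet below is a minimal illustration, not the project's actual ETL code: it parses a few rows copied from the table above plus one deliberately incomplete row added for demonstration, drops rows with a missing value, and builds a lookup from contract type to its `INDEX2` code. The function name `load_contract_codes` is hypothetical.

```python
import csv
import io

# A few rows copied from the table above. The last row is NOT in the source
# data; it is an invented incomplete row that shows how invalid rows are dropped.
SAMPLE_CSV = """Unnamed: 0,CONTRACT_TYPE,INDEX2
0,CONCESION DE EXPLOTACION NO CONVENCIONAL,3.0
1,CONCESION DE EXPLOTACION,1.0
3,SIN CONTRATO,10.0
99,,
"""


def load_contract_codes(csv_text):
    """Return {contract type: integer code}, skipping rows with missing values."""
    codes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        contract = (row.get("CONTRACT_TYPE") or "").strip()
        index2 = (row.get("INDEX2") or "").strip()
        if not contract or not index2:
            continue  # same effect as dropna() for these two columns
        codes[contract] = int(float(index2))  # "3.0" -> 3
    return codes


contract_codes = load_contract_codes(SAMPLE_CSV)
print(contract_codes)
```

With pandas, the equivalent step is the `dropna()` call in the cell above; the lookup dictionary is one possible way to turn the long contract names into compact codes for filtering.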


Project Proposal:

Project Proposal is due by end-of-class Wednesday.

It needs to include:

Your dataset (include link)
What you plan/hope to predict
Screenshots for tableau inspiration
Links to code/notebooks for ML inspiration
Break down of roles/responsibilities
Basic hypotheses, trends
Design inspirations - color schemes, fonts, etc.


It is preferred that each group takes a different topic, so try and call/get it approved by me or the TAs as soon as you can!

Roles and Responsibilities:

Who is leading ML? (usually 1 or 2 people)
Who is doing Tableau? (usually 1 or 2 people)
Who is leading the website/flask construction? (usually 1 person)
Who is leading the writeup/presentation slide deck creation? (usually 1 person)

Rough Calendar

By EOD 4/21: Project Proposal is due. Dataset is found. Roles are decided. You are ahead if you have loaded your data into Python for basic exploration and understanding. UNDERSTAND YOUR DATA


By EOD 4/24: You have cleaned your data in such a way that it can now be loaded into Tableau. You have made a couple of initial vizzes. You have started cleaning/prepping the data for machine learning or perhaps built an initial model. DATASET IS LOCKED


By EOD 4/26: Machine Learning is basically done. Tableau is going well - maybe one dashboard is done, working on the second


By EOD 4/28: Website/Flask is started. Writeup is started. Tableau has made demonstrable progress. ML is finished and saved (pickled)


By EOD 5/1: Website work continues. Tableau is finished and embedded in HTML. Writeup is essentially done. Slide deck is started


By EOD 5/3: Final Push. Website is functional if not deployed. Tableau has final tweaks. Writeup is done. Presentation is planned out. I AM BACK and will help to deploy to Heroku.


By EOD 5/5: Project Presentations


Other Notes:

For Tableau, I expect to have at least two dashboards
Tableau must be deployed to Tableau Public
I recommend you embed Tableau using the Tableau JS API
I recommend you download your CSV data from Kaggle. I recommend you do NOT use an API
Do not touch a database
This is not an ETL project. Focus on the ML and Tableau - not data cleaning
Don't be afraid to pursue a complicated topic like NLP or Computer Vision - just know that will take additional time and effort
I am not grading individual contributions to GitHub. Use it as a tool to help you keep track of code. Don't use it for Tableau. All code and writeups must be in ONE Github repo by May 5
Writeups and Slide deck and Project Proposal need to be uploaded in PDF format
I don't care about GitHub ReadMe's - do it if you want but I'm not going to force you
I would learn/use JQuery to send the requests to the Flask endpoints

THEMING

Think about design! Think about art! Think about colors and fonts! Think about user interfaces!
I would love to see a full color palette and chosen font style that matches your project
I recommend you use Bootswatch as an HTML template (in #04_resources)
Feel free to use an external HTML template - just make it consistent

I will likely be available sporadically on my phone lol so don't be startled if I randomly ask a question or put an emoji in your chat. Don't use me for advanced technical questions though - but I will be checking in every now and then.

Common Projects:

Movies
Wine
Crime
WHO Data
Census Data
Nutrition
Sports
Baseball
Health
Covid
Spotify/Music
Starbucks
Earthquakes
Stonks
Tweets
Dogs
Dinosaurs
Flowers
Computer vision - ResNet


Here are some past projects:

https://dallas-crime-smu.herokuapp.com/

https://texas-accidents-smu.herokuapp.com/

https://smu-2021-capstone-group3.herokuapp.com

Here is the GitHub of the Crime Capstone group:

https://github.com/alexarnold630/Dallas_Crime_Analysis_and_Prediction/tree/main/Crime

I recommend you look at their reports, slide decks, and flask app

Here is the car accident group's repo. Again, look at the reports and the flask app

https://github.com/RyanPerm/Capstone_Final_Project

#### Proposal


### 1) Tableau to guide user insights about this hydrocarbon-rich area of the world

##### a) Based on csv(s) already cleaned in ETL project


### 2) Machine learning to predict the probability of finding oil based on depth in the subsurface, latitude and longitude provided by the user and deliver a call for action (OPTION 1)

##### a) Based on csv(s) already cleaned in ETL project

#### b) Jupyter Notebook for Machine Learning (may be more than 1)

In [None]:
import pickle
import numpy as np


class ModelHelper():
    def __init__(self):
        pass

    def makePredictions(self, sex_flag, age, fare, familySize, pclass, embarked):
        # one-hot encode pclass (all zeros if the value is not 1, 2 or 3)
        pclass_1 = int(pclass == 1)
        pclass_2 = int(pclass == 2)
        pclass_3 = int(pclass == 3)

        # one-hot encode embarked (all zeros if the value is not C, Q or S)
        embarked_c = int(embarked == "C")
        embarked_q = int(embarked == "Q")
        embarked_s = int(embarked == "S")

        # a single input row; the feature order must match the training data
        input_pred = [[sex_flag, age, fare, familySize, pclass_1, pclass_2, pclass_3, embarked_c, embarked_q, embarked_s]]

        # load the pickled model; the context manager closes the file
        filename = 'finalized_model.sav'
        with open(filename, 'rb') as model_file:
            ada_load = pickle.load(model_file)

        X = np.array(input_pred)
        preds_singular = ada_load.predict(X)

        # return the predicted class for the single input row
        return preds_singular[0]
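
The helper above still takes passenger-style features (age, fare, embarked), whereas the prediction described in section 2 is based on depth, latitude and longitude. The following is a hedged sketch of how the helper might look once adapted. The class name `OilModelHelper`, the feature order, the 0.5 threshold and the wording of the call to action are all assumptions, and `StubModel` only stands in for a trained classifier that follows scikit-learn's `predict_proba` convention.

```python
import pickle


def load_model(path):
    """Load a pickled classifier from disk (only unpickle files you trust)."""
    with open(path, "rb") as model_file:
        return pickle.load(model_file)


class OilModelHelper:
    """Hypothetical adaptation of ModelHelper to depth/latitude/longitude."""

    def __init__(self, model, threshold=0.5):
        self.model = model          # any object with predict_proba(rows)
        self.threshold = threshold  # assumed cut-off for the call to action

    def make_prediction(self, depth_m, latitude, longitude):
        if depth_m < 0:
            raise ValueError("depth must be non-negative")
        if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
            raise ValueError("latitude/longitude out of range")

        # Feature order is an assumption; it must match the training data.
        row = [[float(depth_m), float(latitude), float(longitude)]]
        # predict_proba returns one [P(class 0), P(class 1)] pair per row.
        probability = float(self.model.predict_proba(row)[0][1])
        if probability >= self.threshold:
            action = "worth evaluating"
        else:
            action = "low priority"
        return {"probability": probability, "action": action}


class StubModel:
    """Stand-in for a trained classifier, used only to exercise the helper."""

    def predict_proba(self, rows):
        # Toy rule, not geology: deeper targets get a higher probability.
        return [[1 - min(r[0] / 4000.0, 1.0), min(r[0] / 4000.0, 1.0)] for r in rows]


helper = OilModelHelper(StubModel())
result = helper.make_prediction(depth_m=3000, latitude=-38.95, longitude=-68.06)
print(result)
```

Passing the model into the constructor, rather than unpickling it on every call, means the file is read once at start-up; in the real app the stub would be replaced by `load_model('finalized_model.sav')`.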


### 3) Full stack web application (Flask) hosted on Heroku. The application will serve live, customized predictions from the machine learning model and tell an interactive story using Tableau.

##### a) The Flask app will look like this

In [None]:
from flask import Flask, render_template, jsonify, send_from_directory, request
import json
import pandas as pd
import numpy as np
import os
from modelHelper import ModelHelper

#init app and class
app = Flask(__name__)
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
modelHelper = ModelHelper()

#endpoint
# Favicon
@app.route('/favicon.ico')
def favicon():
    return send_from_directory(os.path.join(app.root_path, 'static'),
                          'favicon.ico',mimetype='image/vnd.microsoft.icon')

# Route to render index.html template
@app.route("/")
def home():
    # Return template and data
    return render_template("index.html")

@app.route("/makePredictions", methods=["POST"])
def makePredictions():
    content = request.json["data"]

    # parse
    sex_flag = int(content["sex_flag"])
    age = float(content["age"])
    fare = float(content["fare"])
    familySize = int(content["familySize"])
    p_class = int(content["p_class"])
    embarked = content["embarked"]

    # #dummy data
    # sex_flag = 1
    # age = 25
    # fare = 25
    # familySize = 2
    # p_class = 1
    # embarked = "C"

    prediction = modelHelper.makePredictions(sex_flag, age, fare, familySize, p_class, embarked)
    print(prediction)
    return(jsonify({"ok": True, "prediction": str(prediction)}))

####################################
# ADD MORE ENDPOINTS

###########################################

#############################################################

@app.after_request
def add_header(r):
    """
    Add headers that force the latest IE rendering engine or Chrome Frame,
    and that disable caching so the browser always fetches a fresh response.
    """
    r.headers['X-UA-Compatible'] = 'IE=Edge,chrome=1'
    r.headers["Cache-Control"] = "no-cache, no-store, must-revalidate, public, max-age=0"
    r.headers["Pragma"] = "no-cache"
    r.headers["Expires"] = "0"
    return r

#main
if __name__ == "__main__":
    app.run(debug=True)
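
The `/makePredictions` route above converts each field with `int()` or `float()` directly, so a missing or malformed field would raise an exception and surface as a server error. One possible way to guard against that, sketched here with a hypothetical `parse_payload` helper, is to validate the payload first and collect a list of problems that the route could send back to the client. The field names and types are taken from the route above; everything else is an assumption.

```python
# Expected fields and the converter applied to each, mirroring the route above.
FIELD_TYPES = {
    "sex_flag": int,
    "age": float,
    "fare": float,
    "familySize": int,
    "p_class": int,
    "embarked": str,
}


def parse_payload(content):
    """Return (parsed_values, errors); parsed_values is None when errors exist."""
    if not isinstance(content, dict):
        return None, ["payload must be a JSON object"]

    parsed, errors = {}, []
    for name, convert in FIELD_TYPES.items():
        if name not in content:
            errors.append(f"missing field: {name}")
            continue
        try:
            parsed[name] = convert(content[name])
        except (TypeError, ValueError):
            errors.append(f"invalid value for {name}: {content[name]!r}")
    return (None, errors) if errors else (parsed, [])


# Values taken from the commented-out dummy data in the route above.
good, good_errors = parse_payload(
    {"sex_flag": "1", "age": "25", "fare": "25", "familySize": "2",
     "p_class": "1", "embarked": "C"}
)
bad, bad_errors = parse_payload({"sex_flag": "1", "age": "abc"})
print(good)
print(bad_errors)
```

Inside the route, the helper would be called on `request.json["data"]`, and a non-empty error list could be returned with `jsonify` and a 400 status so that bad input never reaches the model.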

#### b) Contour maps done in matplotlib like this

In [None]:
import pandas as pd
import numpy as np
data_url = 'https://raw.githubusercontent.com/alexmill/website_notebooks/master/data/data_3d_contour.csv'
contour_data = pd.read_csv(data_url)
contour_data.head()
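
Wells are scattered points, while a contour plot needs values on a regular grid. The sketch below shows one common way to bridge that gap, inverse distance weighting, using only the standard library. It is an illustration with made-up sample points, not a method this project has committed to. The function names `idw`, `make_grid` and `plot_contours` are hypothetical, and `plot_contours` assumes matplotlib is installed.

```python
import math


def idw(x, y, points, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def make_grid(points, nx=20, ny=20):
    """Interpolate the points onto an ny-by-nx grid covering their extent."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    grid_x = [min(xs) + i * (max(xs) - min(xs)) / (nx - 1) for i in range(nx)]
    grid_y = [min(ys) + j * (max(ys) - min(ys)) / (ny - 1) for j in range(ny)]
    grid_z = [[idw(gx, gy, points) for gx in grid_x] for gy in grid_y]
    return grid_x, grid_y, grid_z


def plot_contours(grid_x, grid_y, grid_z, filename="contour.png"):
    """Draw filled contours; matplotlib is imported lazily so the rest runs without it."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    filled = ax.contourf(grid_x, grid_y, grid_z, levels=10)
    fig.colorbar(filled, ax=ax)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(filename)
    plt.close(fig)


# Made-up (x, y, value) samples, for illustration only.
sample_points = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0), (1.0, 1.0, 40.0)]
grid_x, grid_y, grid_z = make_grid(sample_points, nx=5, ny=5)
print(grid_z[2][2])
```

Calling `plot_contours(grid_x, grid_y, grid_z)` would write the filled contour map to `contour.png`; in the project, the sample points would be replaced by well coordinates and the calculated property.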


#### c) CSS will look like these two examples

In [None]:
body {
  font-family: Arial;
  color: white;
}

.split {
  height: 100%;
  width: 50%;
  position: fixed;
  z-index: 1;
  top: 0;
  overflow-x: hidden;
  padding-top: 20px;
}

.left {
  left: 0;
  background-color: #111;
}

.right {
  right: 0;
  background-color: red;
}

.centered {
  position: absolute;
  top: 50%;
  left: 50%;
  transform: translate(-50%, -50%);
  text-align: center;
}

.centered img {
  width: 150px;
  border-radius: 50%;
}

In [None]:
.box-set {
  position: relative;
  height: 450px;
  background:  #1a146e;
}

.box {
  /* Absolute Position */
  position: absolute;
  width: 150px;
  height: 150px;
  background:  #1a146e;
  border: 0px solid white;
}
.box-1 {
  top: 200px;
  left: -300px;
}

.box-2 {
  top: 200px;
  right: 500px;
}

.box-3 {
  right: 500px;
  bottom: -200px;
}

.box-4 {
  bottom: -200px;
  left: -300px
}

h1 {
  text-align: center;
  background-color: rgb(223, 235, 234);
  color: red;
}

h2 {
  background-color: rgb(25, 236, 236);
  color: black;
}

h3 {
  color: rgb(19, 21, 173);
}

img {
  box-shadow: 3px 3px 5px white;
}

section {
  background-color: #000;
  color: deeppink;
}

p {
  font-size: 15px;
  color: gray;
}

ul {
  border-style: solid;
  color: gray;
}

#### d) JavaScript will look like this

In [None]:
$(document).ready(function() {
    console.log("Page Loaded");

    $("#filter").click(function() {
        makePredictions();
    });
});

// call Flask API endpoint
function makePredictions() {
    var sex_flag = $("#gender").val();
    var age = $("#age").val();
    var fare = $("#fare").val();
    var familySize = $("#familySize").val();
    var p_class = $("#pclass").val();
    var embarked = $("#embarked").val();

    // create the payload
    var payload = {
        "sex_flag": sex_flag,
        "age": age,
        "fare": fare,
        "familySize": familySize,
        "p_class": p_class,
        "embarked": embarked
    }

    // Perform a POST request to the query URL
    $.ajax({
        type: "POST",
        url: "/makePredictions",
        contentType: 'application/json;charset=UTF-8',
        data: JSON.stringify({ "data": payload }),
        success: function(returnedData) {
            // print it
            console.log(returnedData);

            if (returnedData["prediction"] == 1) {
                $("#output").text("You Survived!");
            } else {
                $("#output").text("You Died!");
            }
        },
        error: function(XMLHttpRequest, textStatus, errorThrown) {
            alert("Status: " + textStatus);
            alert("Error: " + errorThrown);
        }
    });

}
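
The same request can be reproduced from Python, which is handy for checking the endpoint without a browser. This sketch only builds the request. The base URL is an assumption (Flask's default development address), and the helper name `build_prediction_request` is hypothetical. The payload shape, `{"data": {...}}`, matches what the jQuery call above sends.

```python
import json
import urllib.request


def build_prediction_request(payload, base_url="http://127.0.0.1:5000"):
    """Build a POST request for /makePredictions without sending it."""
    body = json.dumps({"data": payload}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/makePredictions",
        data=body,
        headers={"Content-Type": "application/json;charset=UTF-8"},
        method="POST",
    )


# Values taken from the commented-out dummy data in the Flask route.
request = build_prediction_request(
    {"sex_flag": "1", "age": "25", "fare": "25",
     "familySize": "2", "p_class": "1", "embarked": "C"}
)
print(request.get_method(), request.full_url)

# To actually send it while the Flask app is running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```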


### 4) Essay about the geological interpretation based on the data

##### a) I am familiar with this area and data but still learning as they adapt to new realities and change strategies


#### Full Analytical Writeup (Optional Executive Summary) + Works Cited

### 5) Presentation Slide Deck

### 6) Deployment to Heroku

#### a) Procfile will look like this

In [7]:
web: gunicorn app:app

#### b) Requirements.txt will look like this:

In [None]:
certifi==2020.12.5
click==7.1.2
Flask==1.1.2
gunicorn==20.0.4
itsdangerous==1.1.0
Jinja2==2.11.3
joblib==1.0.1
MarkupSafe==1.1.1
numpy==1.19.5
pandas==1.1.5
python-dateutil==2.8.1
pytz==2021.1
scikit-learn==0.23.2
scipy==1.5.4
six==1.15.0
threadpoolctl==2.1.0
Werkzeug==1.0.1
wincertstore==0.2


#### c) Templates will look like this:

In [None]:
index.html

### 7) Upload to GitHub