

<p align="center">
    <img src="https://github.com/GeostatsGuy/GeostatsPy/blob/master/TCG_color_logo.png?raw=true" width="220" height="240" />

</p>

## PGE 383 Graduate Student Project Template 

#### Michael Pyrcz, Associate Professor, University of Texas at Austin 

##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig)  | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)

#### General Guidance

I'm expecting an **educational product** that could be handed to a student so they could quickly learn a new concept. Ask yourself: could someone new to machine learning understand my workflow?

* this is a win-win since it is an opportunity for you to dive deeper and explore a concept related to a machine learning algorithm, and it is good practice for communicating machine learning to others (something that you will have to do at work)

#### Expectations for Your Well-documented Workflow

* **flow** include a consistent narrative, e.g., no two code blocks should be adjacent; always include a short statement between them to explain and connect to the next code block

* **concise** be as concise as possible:

    * use point form (except for the executive summary) 
    * use effective, creative figures that combine what could have been in multiple plots onto a single plot when possible
    * every line of code, statement or figure must have a purpose
    * conciseness is part of the grading, don't add content that isn't needed
    * aim for 4-5 pages of Jupyter notebook
    
* **clear** be very clear:

    * great executive summary
    * label every axis
    * use readable code, logical variable names, use available functionality for compactness
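
A minimal sketch of the figure guidance above: two groups share one plot instead of two separate plots, and every axis is labeled with units. The porosity and permeability values, the facies flag and the 15% cutoff are all hypothetical, generated only for illustration.

```python
import numpy as np                                        # for generating the hypothetical data
import matplotlib.pyplot as plt                           # for plotting

rng = np.random.default_rng(seed=73073)                   # fixed seed so the figure is reproducible
porosity = rng.normal(15.0, 3.0, 200)                     # hypothetical porosity (%)
log_perm = 0.15 * porosity + rng.normal(0.0, 0.4, 200)    # hypothetical log10 of permeability
permeability = 10.0 ** log_perm                           # hypothetical permeability (mD)
is_sand = porosity > 15.0                                 # hypothetical facies flag from a porosity cutoff

fig, ax = plt.subplots(figsize=(6, 4))
for flag, label, color in [(True, 'sand', 'gold'), (False, 'shale', 'gray')]:
    mask = is_sand == flag                                # select the samples for this facies
    ax.scatter(porosity[mask], permeability[mask], s=12, color=color, label=label)
ax.set_yscale('log')                                      # permeability spans orders of magnitude
ax.set_xlabel('Porosity (%)')                             # label every axis, with units
ax.set_ylabel('Permeability (mD)')
ax.set_title('Permeability vs. Porosity by Facies')
ax.legend(title='Facies')
```

In a Jupyter notebook the figure renders at the end of the cell; in a script, add `plt.show()` or `fig.savefig(...)`.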
  
#### Using Code From Others
  
You may use blocks / snippets of code from other sources with citation. To cite a set of code, separate it into its own block and put the citation in the Markdown above the block.

The following code block is from Professor Michael Pyrcz (@GeostatsGuy), SubsurfaceDataAnalytics_PCA.ipynb from [GeostatsGuy GitHub](https://github.com/GeostatsGuy/PythonNumericalDemos/blob/master/SubsurfaceDataAnalytics_PCA.ipynb).

```python
def simple_simple_krige(df,xcol,ycol,vcol,dfl,xlcol,ylcol,vario,skmean):
# load the variogram
    nst = vario['nst']; pmx = 9999.9
    cc = np.zeros(nst); aa = np.zeros(nst); it = np.zeros(nst)
```

or use inline citations such as this for a few lines of code.

```python
def simple_simple_krige(df,xcol,ycol,vcol,dfl,xlcol,ylcol,vario,skmean): # function from Professor Michael Pyrcz,https://github.com/GeostatsGuy/PythonNumericalDemos/blob/master/SubsurfaceDataAnalytics_PCA.ipynb 
```

#### The Workflow Template

Here's the template for your workflow.

____________________



## Title of Your Workflow

#### Your Name
#### Your Department, School

### Subsurface Machine Learning Course, The University of Texas at Austin
#### Hildebrand Department of Petroleum and Geosystems Engineering, Cockrell School of Engineering
#### Department of Geological Sciences, Jackson School of Geosciences




_____________________

Workflow supervision and review by:

#### Instructor: Prof. Michael Pyrcz, Ph.D., P.Eng., Associate Professor, The University of Texas at Austin
##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig)  | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)

#### Course TA: Misael Morales, Graduate Student, The University of Texas at Austin
##### [LinkedIn](https://www.linkedin.com/in/misaelmmorales/)


### Executive Summary

* What is the gap, problem, opportunity, scientific question?

* What was done to address the above?

* What was learned?

* What are your recommendations?

**Guidance**: Write as a single paragraph with 4 or so well-written sentences.

### Import Packages

```python
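import numpy as np                                        # for working with data and model arrays
import pandas as pd                                       # for loading and working with tabular data (DataFrames)
import matplotlib.pyplot as plt                           # for plotting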
```

**Guidance**: Only include the packages that you need for your workflow, and comment each import with the reason for including it.

### Functions

The following functions will be used in the workflow.

```python
def plot_corr(dataframe,size=10):                         # plots a correlation matrix as a heat map 
    corr = dataframe.corr()
    fig, ax = plt.subplots(figsize=(size, size))
    im = ax.matshow(corr,vmin = -1.0, vmax = 1.0)
    plt.xticks(range(len(corr.columns)), corr.columns)
    plt.yticks(range(len(corr.columns)), corr.columns)
    plt.colorbar(im, orientation = 'vertical')
    plt.title('Correlation Matrix')
```

**Guidance**: If you have blocks of code that are used repeatedly in the workflow, define a function to improve workflow compactness and readability, and provide a short comment on what it does.

### Load Data

The following workflow uses the .csv file '300well_MV.csv', a synthetic dataset calculated with geostatistical cosimulation by Wayne Gretzky, The Edmonton Oilers Hockey Team. The dataset is publicly available [here](http://www.hasthelargehadroncolliderdestroyedtheworldyet.com/).

We will work with the following features:

* **porosity** - fraction of the rock volume that is void space, in units of percentage
* **permeability** - ability of a fluid to flow through the rock, in units of milliDarcy
* **acoustic impedance** - product of sonic velocity and rock density, in units of $kg/(m^2 s) \times 10^3$

```python
my_data = pd.read_csv(r"https://raw.githubusercontent.com/GeostatsGuy/GeoDataSets/master/unconv_MV.csv") # load the comma delimited data file from Dr. Pyrcz's GeoDataSets GitHub repository
my_data = my_data.iloc[:,1:8]                             # copy all rows and columns 1 through 7 (the slice end, 8, is excluded), note column 0 is removed
```

**Guidance**: For workflow clarity, only pass forward the data and samples used in your workflow. If you make a custom dataset by Monte Carlo simulation, etc., do it here.

* load the data from the cloud; we must be able to load the data and reproduce your workflow. Data must be publicly available. You **may not** submit data as a separate file with your final project, and we **will not** accept confidential data.
* list the feature name, explain the feature and provide the units (make sure all plots have units)
* include data preparation such as renaming and extracting features
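
If you build a custom dataset instead of loading one, a minimal sketch of the Monte Carlo and data preparation steps above might look like the following. The sample count, distribution parameters and column names are hypothetical, and the features are drawn as independent Gaussians purely for illustration.

```python
import numpy as np                                        # for random number generation
import pandas as pd                                       # for the tabular dataset

rng = np.random.default_rng(seed=73073)                   # fixed seed so the dataset is reproducible
n_samples = 100                                           # hypothetical number of samples

my_data = pd.DataFrame({                                  # hypothetical features, independent Gaussian draws
    'Well': np.arange(1, n_samples + 1),                  # sample index, not used as a feature
    'Por': rng.normal(loc=15.0, scale=3.0, size=n_samples),
    'AI': rng.normal(loc=3.0, scale=0.5, size=n_samples),
})
my_data = my_data.rename(columns={'Por': 'Porosity', 'AI': 'AcousticImpedance'})  # readable feature names
my_data = my_data[['Porosity', 'AcousticImpedance']]      # only pass forward the features used
print(my_data.describe().transpose().round(2))            # concise summary statistics check
```

Fixing the random seed keeps the workflow reproducible even though no file is loaded.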

### Name of My Workflow

A short summary of your workflow. This is a suggestion. The main thing is to be clear and concise. Easy to follow!

1. If helpful, you could include

2. enumeration

### 1. Name Your First Workflow Step

This is a short summary of this step.

Short Markdown block, code, concise and essential output summaries, repeat  

### 2. Name Your Second Workflow Step

Repeat as needed

### Results

Final summary of results. Include a table (DataFrame) and / or plots, answer the problem and demonstrate the work stated in the executive summary.
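
As one possible shape for a results table, the sketch below compares a few polynomial regression models in a single DataFrame. The data, the 90 / 30 train and test split, and the polynomial degrees are hypothetical; substitute the models and metrics from your own workflow.

```python
import numpy as np                                        # for data generation and polynomial fitting
import pandas as pd                                       # for the results table

rng = np.random.default_rng(seed=73073)                   # fixed seed for reproducibility
x = rng.uniform(5.0, 25.0, 120)                           # hypothetical predictor feature
y = 2.0 * x + rng.normal(0.0, 3.0, 120)                   # hypothetical response with noise
x_train, x_test = x[:90], x[90:]                          # simple hold-out split
y_train, y_test = y[:90], y[90:]

rows = []
for degree in (1, 2, 3):                                  # hypothetical model complexities to compare
    coefs = np.polyfit(x_train, y_train, degree)          # least-squares polynomial fit on training data
    mse_train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    mse_test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    rows.append({'Polynomial degree': degree, 'Train MSE': mse_train, 'Test MSE': mse_test})

results = pd.DataFrame(rows).set_index('Polynomial degree').round(2)
print(results)                                            # one compact table instead of scattered print statements
```

A single table like this lets the reader compare training and testing error across models at a glance.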

### Parting Comments / Promote You

Consider adding any information to promote your capabilities, such as your interest in internships or full-time positions.

* This workflow will be shared / posted online and will promote you. 

* It is optional to retain my information below yours. This may also provide some ideas.



I hope this was helpful,

*Your name*

___________________

#### Work Supervised by:

### Michael Pyrcz, Associate Professor, University of Texas at Austin 
*Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions*

With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development. 

For more about Michael check out these links:

#### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig)  | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)

#### Want to Work Together?

I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.

* Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you! 

* Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!

* I can be reached at mpyrcz@austin.utexas.edu.

I'm always happy to discuss,

*Michael*

Michael Pyrcz, Ph.D., P.Eng., Associate Professor, The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin
