In [1]:
import requests
from IPython.display import Markdown

url = 'https://kata.geosci.ai/challenge/prospecting'

r = requests.get(url)
print('Status', r.status_code)

Markdown(r.text)

Status 200


# Prospecting

We have 5 arrays of 4096 elements each. Each array represents a map as a 'raster' with 64 × 64 = 4096 pixels, and is given as a row in the dataset. Each pixel is represented by a single integer, taking values from 0 to 8.

The maps represent different things. In order, they are:

1. Reliability of well data.
2. Reliability of seismic data.
3. Porosity from wells and conceptual models.
4. Fracture density from wells and seismic.
5. Our land position (1 denotes 'our land').

We need to answer the following questions:

1. How many pixels have zero total reliability?
2. How many pixels are predicted to have better than 50th percentile (P50) porosity and better than P50 fracture density?
3. How many of these pixels have non-zero reliability and are on our land? These blobs are our _prospects_.
4. Find the product of the (x, y) coordinates of the cell containing the centre of mass of the largest _prospect_ blob.

For question 4, a centre of mass at (3.4, 12.6) is in the cell (3, 12) and you would respond with 3 × 12 = **36**.
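The conversion from a centre of mass to an answer can be sketched as below. The helper name `cell_product` is hypothetical, and the sketch assumes that "the cell containing" a point means flooring each coordinate, which matches the (3.4, 12.6) example.

```python
import math

def cell_product(centre):
    """Floor each coordinate to get the containing cell, then multiply."""
    x, y = (math.floor(c) for c in centre)
    return x * y

print(cell_product((3.4, 12.6)))  # 36
```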

We'll consider pixels to belong to the same blob if they are directly neighbouring, that is, if they share an edge; pixels that only touch diagonally are not connected. In example A, below, there are 3 'blobs' of one pixel each. In example B there are 2 blobs, each with three pixels.

      A        B
    1 0 1    1 1 0
    0 1 0    1 0 1
    0 0 0    0 1 1
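As a quick check of this connectivity rule, the two examples can be passed to `scipy.ndimage.label()`, the function named in the Hints section. This is a sketch assuming NumPy and SciPy are installed; the variable names are illustrative only.

```python
import numpy as np
from scipy import ndimage

A = np.array([[1, 0, 1],
              [0, 1, 0],
              [0, 0, 0]])
B = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])

# With its default structuring element, label() joins pixels that share
# an edge but not pixels that only touch at a corner.
_, n_a = ndimage.label(A)
_, n_b = ndimage.label(B)

print(n_a, n_b)  # 3 2
```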


## Example

Here is a dataset of smaller maps. Every row represents a map, each 3 × 3 pixels:

    example = """0,1,0,1,2,1,0,1,0
                 2,1,0,1,1,1,0,1,0
                 0,1,2,1,3,1,1,2,2
                 0,2,1,2,3,1,1,3,2
                 1,1,1,1,1,1,0,0,0"""

If we reshape each row into a 3 × 3 map, the maps look like:

      1       2       3       4       5    <--- map number
    0 1 0   2 1 0   0 1 2   0 2 1   1 1 1
    1 2 1   1 1 1   1 3 1   2 3 1   1 1 1
    0 1 0   0 1 0   1 2 2   1 3 2   0 0 0   
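One way to get from the text to these maps is sketched below, assuming NumPy is available. The names `rows` and `maps` are illustrative. For the full dataset the same approach would presumably use a shape of `(5, 64, 64)`.

```python
import numpy as np

example = """0,1,0,1,2,1,0,1,0
             2,1,0,1,1,1,0,1,0
             0,1,2,1,3,1,1,2,2
             0,2,1,2,3,1,1,3,2
             1,1,1,1,1,1,0,0,0"""

# int() ignores the leading whitespace on the continuation lines.
rows = [[int(v) for v in line.split(',')] for line in example.splitlines()]

# One 3 x 3 map per row of the dataset.
maps = np.array(rows).reshape(5, 3, 3)

print(maps[2])  # the porosity map
```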

Here's how we might answer the questions:

1. There are **3** pixels with zeros in both of the reliability maps (the first two maps).
2. The P50 values on maps 3 and 4 are 1 and 2 respectively. There are **2** pixels that are higher on both maps.
3. Of those pixels, **1** has non-zero reliability and is on our land (map 5).
4. The coordinates of that pixel are (1, 1) so the product of those coordinates is **1**.
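The first three answers can be reproduced with a few boolean masks. This is only a sketch under two assumptions that are consistent with the example but not stated outright: "total reliability" is taken as the sum of the two reliability maps, and "better than P50" is taken as strictly greater than the median. All variable names are illustrative.

```python
import numpy as np

# The five example maps, kept flat because no spatial logic is needed yet.
well, seismic, porosity, fractures, land = np.array([
    [0, 1, 0, 1, 2, 1, 0, 1, 0],
    [2, 1, 0, 1, 1, 1, 0, 1, 0],
    [0, 1, 2, 1, 3, 1, 1, 2, 2],
    [0, 2, 1, 2, 3, 1, 1, 3, 2],
    [1, 1, 1, 1, 1, 1, 0, 0, 0],
])

# Q1: assumed total reliability is the sum of both reliability maps.
reliability = well + seismic
q1 = int(np.sum(reliability == 0))

# Q2: assumed "better than P50" means strictly above the median.
good = (porosity > np.median(porosity)) & (fractures > np.median(fractures))
q2 = int(np.sum(good))

# Q3: keep only pixels with some reliability that are on our land.
prospects = good & (reliability > 0) & (land == 1)
q3 = int(np.sum(prospects))

print(q1, q2, q3)  # 3 2 1
```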


## Hints

It's likely that the `scipy.ndimage.measurements` module will be useful in answering question 4. For example, if you have an array `arr` like:

    0 1 1
    0 0 0
    1 0 0

Then `scipy.ndimage.measurements.label()` will return two things: the labels and the number 2 (meaning it found 2 objects). The labels have the same shape as the original 'map':

    0 1 1   <--- 1 denotes 'object 1'
    0 0 0   <--- 0 denotes 'background', i.e. no objects
    2 0 0   <--- 2 denotes 'object 2'

Once you have labels, you can get the centre of mass of a single object, for example the one labelled `2`, with `scipy.ndimage.measurements.center_of_mass(arr, labels, 2)`.

We will use the default behaviour of the `scipy.ndimage.measurements.label()` function to decide if things are separate objects.
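Putting the hints together, finding the largest object and its centre of mass might look like the sketch below. It calls the functions through `scipy.ndimage` directly, since recent SciPy releases deprecate the `scipy.ndimage.measurements` namespace. Choosing the largest object with `np.bincount` is one option among several, and the variable names are illustrative.

```python
import numpy as np
from scipy import ndimage

arr = np.array([[0, 1, 1],
                [0, 0, 0],
                [1, 0, 0]])

labels, n = ndimage.label(arr)

# Count the pixels carrying each label; index 0 is background, so skip it.
sizes = np.bincount(labels.ravel())[1:]
largest = int(np.argmax(sizes)) + 1

# Coordinates come back in array index order (row, column).
centre = ndimage.center_of_mass(arr, labels, largest)
cell = [int(np.floor(c)) for c in centre]

print(n, largest, centre, cell[0] * cell[1])
```

Because the answer is a product, it does not matter whether the challenge's (x, y) means (row, column) or (column, row).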


## A quick reminder of how this works

You can retrieve your data by choosing any Python string as a **`<KEY>`** and substituting here:
    
    https://kata.geosci.ai/challenge/prospecting?key=<KEY>
                                                     ^^^^^
                                                     use your own string here

To answer question 1, make a request like:

    https://kata.geosci.ai/challenge/prospecting?key=<KEY>&question=1&answer=1234
                                                     ^^^^^          ^        ^^^^
                                                     your key       Q        your answer
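The same URL can be assembled in code rather than by hand. The sketch below uses only the standard library; `answer_url` is a hypothetical helper, and `'mykey'` and `1234` are placeholder values.

```python
from urllib.parse import urlencode

BASE = 'https://kata.geosci.ai/challenge/prospecting'

def answer_url(key, question, answer):
    """Build the URL for submitting an answer to one question."""
    query = urlencode({'key': key, 'question': question, 'answer': answer})
    return f'{BASE}?{query}'

print(answer_url('mykey', 1, 1234))
# https://kata.geosci.ai/challenge/prospecting?key=mykey&question=1&answer=1234
```

With `requests`, passing the same dictionary as `params` to `requests.get()` should produce an equivalent request.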

[Complete instructions at kata.geosci.ai](https://kata.geosci.ai/challenge)

----

© 2020 Agile Scientific, licensed CC-BY

In [2]:
my_key = "scibbatical"

params = {'key': my_key}

r = requests.get(url, params=params)

# Look at the first bit of the input:
r.text[:100]

'000055_Sorbas_Yesaers_C_2000-01-01_xM\n000057_Sorbas_Yesares_H_2000-01-01_PTM\n000058_Sorbas_Yesares_H'