Skip to content

Latest commit

 

History

History
103 lines (56 loc) · 5.84 KB

c+j2019.md

File metadata and controls

103 lines (56 loc) · 5.84 KB

How we used machine learning to search satellite images of Nothern-West regions of Ukraine (70,000 km², ~ roughly the area of Czech republic)) for places with illegal amber mining.

Motivation

In 2010 world prices for amber started to surge. Due to this in 2012 demand was so high that north-western part of Ukraine became place of "amber rush" or "new Wild West". Thousands of prospectors starts to search for gems with shovels and later with water pumps. Hundreds of hectares in forests / agricultural land became a desert, a lifeless moon landscape. 2014-2016 were most intence years of illegal mining, but it's still going right now. How it looks from the drone:

One of the places of extraction

We decided to estimate, for the first time, the scale of environmental impact from this phenomenon.

Project stages

1.

Research how patches of land with illegal mining could look on satellite images (look at images for some known locations with amber mining, use known videos and photos, make interviews with field experts)

Some examples of amber mining patterns

 

2.

Find which map providers have relatively recent satellite images with good resolution. Find examples of places with mining on such images (Mostly it's Bing Maps by Microsoft. It has very generous API, for example it provides needed metadata such as a date for each image. Due to small characteristic size of digged holes, we needed resolution no less then 1m per pixel).

Tiles distribution, by year:

Year 2011 2012 2014 2015 2016
Number of images 1933 4669 117059 271893 55403
 

3.

Find and compile initial set of coordinates for images with traces of mining (we found ~50 such first places with a huge help from participants of Open Data Day Kyiv )

 

4.

Split each such tile to superpixels/segments (part of image with approximately homogeneous visual appearance) With simple linear iterative clustering (SLIC) algorithm (http://www.kev-smith.com/papers/SLIC_Superpixels.pdf)

Superpixels/Segments

 

5.

Use neural net to extract features for each superpixel. We used transfer learning with a pretrained, vanilla ResNet50 from exellent Keras library by François Chollet and others)

 

6.

Create labelled set of superpixels for binary classifier (split images on two sets - with traces of amber mining, and without such traces). Some interesting examples of "false positives", i.e. images which are similar to amber mining. Interesting examples of non-amber class

 

7.

Create machine model to classify superpixels (XGBoost was choosen due to best performance, after several attempts (SVM, RandomForest)). Estimate performance (production model: f1=0.91, recall=0.88, precision=0.95. We cared just about a lower bound of estimation and about false positives, so high precision is a must. ) Make (visual debugging) of classifier with interactive scatterplot.

 

8.

Apply steps 4,5 and classifier from step 7 to each superpixel/segment for all images from region of interest. If more then 2 superpixels classified as positive (note that we halved false positive error here), then mark current image as area of mining. We processed approximately 450,000 images from region with total area about 70,000 km², total computation time was ~100 hours on one computer with two GeForce GTX 960 onboard Detection process illustrated

 

9.

Our readers (local citizens from region) pointed to whole class of errors caused by some specific places with deforestation which we used as positive examples. We removed all such places from training dataset, retrained a model, reprocessed all images and published new version of map after additional review from field experts (took 2 weeks).
Deforestation,but not by amber mining

 

10.

Create interactive map with places found by our method ( Most active period of mining was during 2014-2016 years. Most tiles from maps dated by 2015. We found more then 1,000 hectares of damaged land

debug the tiles

 

Result: we present most detailed (as for this moment), interactive map of impact on environment due to illegal amber mining in Ukraine (in English)

 

Python notebooks