# Mapping cultural remains with object detection
All forests in Sweden, both managed forests and natural old-growth forests in national parks, contain a cultural
heritage. The long history of forest utilization in Sweden has left a rich legacy of diverse types of ancient
monuments and other kinds of cultural remains that document our relationship with the forest and its importance
for Sweden’s development. However, this cultural heritage is too often damaged during forestry operations. The aim of
the project is to research and develop operationally useful maps that can be used
to identify, protect and enhance the cultural remains in Swedish forests, thereby reducing the destruction of
cultural heritage in our forest landscapes.

In [3]:
!pip install rtree
!pip install torch
!pip install torchvision

Collecting rtree
  Downloading Rtree-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 22.4 MB/s eta 0:00:00
Installing collected packages: rtree
Successfully installed rtree-1.0.0
You should consider upgrading via the '/usr/bin/python -m pip install --upgrade pip' command.

# Split charcoal labels


In [None]:
!python /workspace/code/tools/split_training_data.py /workspace/data/object_detection/segmentation_masks/charcoal_kilns/ /workspace/data/object_detection/split_segmentations_masks/charcoal_kilns --tile_size 250

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 0 files in the /workspace/data/object_detection/split_segmentations_masks/charcoal_kilns
New image name will start with 1
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 311.96img/s]
400 tiles sample of /workspace/data/object_detection/segmentation_masks/charcoal_kilns/18D022_67450_5775_25.tif are added at /workspace/data/object_detection/split_segmentations_masks/charcoal_kilns
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 400 files in the /workspace/data/object_detection/split_segmentations_masks/charcoal_kilns
New image name will start with 401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 314.87img/s]
400 tiles sample of /workspace/data/object_detection/segmentation_masks/charcoal_kilns/18D022

## Convert segmentation masks to bounding boxes

In [2]:
!python /workspace/code/object_detection/masks_to_boxes.py /workspace/temp/ /workspace/data/object_detection/split_segmentations_masks/hunting_pits/ 250 1 /workspace/data/object_detection/bounding_boxes/hunting_pits/

Traceback (most recent call last):
  File "/workspace/code/object_detection/masks_to_boxes.py", line 10, in <module>
    import pybboxes as pbx
ModuleNotFoundError: No module named 'pybboxes'


## Topographical modeling

### Select LAZ tiles intersecting field data

In [2]:
!python /workspace/code/create_aoi_poolygon.py /workspace/lidar/none.shp /workspace/data/hunting_pits/Fangstgrop_training_Holmen_Cissi_695st_220214.shp /workspace/lidar/pooled_laz_files/ /workspace/data/hunting_pits/laz/ 

  main(**args)
Traceback (most recent call last):
  File "/workspace/code/create_aoi_poolygon.py", line 37, in <module>
    main(**args)
  File "/workspace/code/create_aoi_poolygon.py", line 25, in main
    copy_tiles(footprint, field_data, input_directory, output_directory)
  File "/workspace/code/create_aoi_poolygon.py", line 13, in copy_tiles
    intersect = gpd.sjoin(lidar_tiles_footprint, field, how='inner', op='intersects')
  File "/usr/local/lib/python3.8/dist-packages/geopandas/tools/sjoin.py", line 124, in sjoin
    indices = _geom_predicate_query(left_df, right_df, predicate)
  File "/usr/local/lib/python3.8/dist-packages/geopandas/tools/sjoin.py", line 216, in _geom_predicate_query
    sindex = right_df.sindex
  File "/usr/local/lib/python3.8/dist-packages/geopandas/base.py", line 2637, in sindex
    return self.geometry.values.sindex
  File "/usr/local/lib/python3.8/dist-packages/geopandas/array.py", line 292, in sindex
    self._sindex = _get_sindex_class()(

### Convert selected LAZ files to DEM

In [None]:
!python /workspace/code/laz_to_dem.py /workspace/data/hunting_pits/laz/ /workspace/data/hunting_pits/dem_tiles/

### Extract topographical indices

In [None]:
!python /workspace/code/Extract_topographcical_indices.py /workspace/temp/ /workspace/data/hunting_pits/laz/ /workspace/data/hunting_pits/topographical_indices_normalized/hillshade/ /workspace/data/hunting_pits/topographical_indices_normalized/slope/ /workspace/data/hunting_pits/topographical_indices_normalized/hpmf/ /workspace/data/hunting_pits/topographical_indices_normalized/stdon/

## Labels

**Hunting pits integer masks**

In [None]:
!python /workspace/code/create_labels.py /workspace/data/hunting_pits/dem_tiles/ /workspace/data/hunting_pits/hunting_pits.shp /workspace/data/hunting_pits/object_detection_data/label_tiles/

All data were split into image chips of 256x256 pixels. Note that the output directories need to be empty before running the split script. I found it easiest to recreate the directories to avoid errors.
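
Recreating the output directories can also be scripted. The following is a minimal sketch, not part of the project code: `reset_directories` is a hypothetical helper, and it assumes it is acceptable to delete everything under the given root. `ignore_errors=True` only suppresses errors such as a missing directory; it will not help if files are locked or permissions are wrong.

```python
import shutil
from pathlib import Path


def reset_directories(root, names):
    """Delete `root` (if it exists) and recreate it with empty subdirectories.

    `root` and `names` are illustrative arguments, not taken from the project.
    """
    root = Path(root)
    # Remove the old tree; a missing directory is silently ignored.
    shutil.rmtree(root, ignore_errors=True)
    created = []
    for name in names:
        path = root / name
        # parents=True also creates `root` itself on the first iteration.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For the layout used in this notebook, the call would look like `reset_directories('/workspace/data/hunting_pits/object_detection_data/split_data/', ['labels', 'slope', 'hillshade', 'hpmf', 'stdon'])`.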

In [132]:
# Start by creating empty output directories for the split data
import os

# shutil.rmtree('/workspace/data/split_data/')  # this fails a lot, so remove the old directories manually
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/')
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/labels/')
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/slope/')
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/hillshade/')
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/hpmf/')
os.mkdir('/workspace/data/hunting_pits/object_detection_data/split_data/stdon/')  

# Split data
# Hillshade 
!python /workspace/code/split_training_data.py /workspace/data/hunting_pits/topographical_indices_normalized/hillshade/ /workspace/data/hunting_pits/object_detection_data/split_data/hillshade/ --tile_size 256
# Slope
!python /workspace/code/split_training_data.py /workspace/data/hunting_pits/topographical_indices_normalized/slope/ /workspace/data/hunting_pits/object_detection_data/split_data/slope/ --tile_size 256
# High pass median filter
!python /workspace/code/split_training_data.py /workspace/data/hunting_pits/topographical_indices_normalized/hpmf/ /workspace/data/hunting_pits/object_detection_data/split_data/hpmf/ --tile_size 256
# Standard deviation of normals (stdon)
!python /workspace/code/split_training_data.py /workspace/data/hunting_pits/topographical_indices_normalized/stdon/ /workspace/data/hunting_pits/object_detection_data/split_data/stdon/ --tile_size 256
# Labels
!python /workspace/code/split_training_data.py /workspace/data/hunting_pits/object_detection_data/label_tiles/ /workspace/data/hunting_pits/object_detection_data/split_data/labels/ --tile_size 256

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 0 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 1
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 316.22img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/18D022_67450_5775_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 400 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 321.47img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/18D022_67475_5750_25.tif a

Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 6000 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 6001
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 342.19img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19F047_71050_7225_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 6400 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 6401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 316.94img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19F047_71075_7200_25.tif are added at /workspace/data/hunting_pits/object_detection_data/spli

There are 12000 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 12001
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 333.07img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19G013_71500_7025_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 12400 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 12401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 307.28img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19G013_71525_7000_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W)

There are 18000 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 18001
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 345.03img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19G013_71675_7025_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 18400 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 18401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 327.47img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/19G013_71675_7050_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W)

There are 24000 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 24001
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 343.57img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/20E018_68775_4800_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=256, stride=256
Padding Image File Shape (D, H, W):(1, 5120, 5120)
There are 24400 files in the /workspace/data/hunting_pits/object_detection_data/split_data/labels/
New image name will start with 24401
Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 336.45img/s]
400 tiles sample of /workspace/data/hunting_pits/object_detection_data/label_tiles/20E018_68800_4750_25.tif are added at /workspace/data/hunting_pits/object_detection_data/split_data/labels/
Input Image File Shape (D, H, W)
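
The log above reports a padded shape of (1, 5120, 5120) and 400 tiles for each 5000x5000 input when `crop_size=256` and `stride=256`. Those numbers are consistent with padding each side up to the next multiple of the tile size. The sketch below only reproduces that arithmetic; it assumes the stride equals the tile size and is not taken from `split_training_data.py`.

```python
import math


def padded_size(length, tile_size):
    """Smallest multiple of tile_size that is >= length."""
    return math.ceil(length / tile_size) * tile_size


def tiles_per_image(height, width, tile_size):
    """Number of non-overlapping tiles after padding (stride == tile_size)."""
    rows = padded_size(height, tile_size) // tile_size
    cols = padded_size(width, tile_size) // tile_size
    return rows * cols


print(padded_size(5000, 256))            # 5120
print(tiles_per_image(5000, 5000, 256))  # 400
print(padded_size(5000, 250))            # 5000, no padding needed
```

With a tile size of 250 the input divides evenly, which matches the unpadded (1, 5000, 5000) shape reported for the charcoal kiln split.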

Not all of the split image chips contained objects. Chips with fewer than 1 labeled pixel (that is, chips without any labeled pixels) were removed.
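
The filtering rule can be illustrated on in-memory arrays. This is a hedged sketch rather than the logic of `remove_unlabled_chips.py`: `chips_to_keep` is a hypothetical helper, the masks are plain NumPy arrays instead of GeoTIFF files, and background pixels are assumed to be 0.

```python
import numpy as np


def chips_to_keep(label_chips, min_pixels=1):
    """Return the names of chips with at least `min_pixels` labeled pixels.

    `label_chips` maps a chip name to its label mask (background assumed to be 0).
    """
    return [
        name
        for name, mask in label_chips.items()
        if np.count_nonzero(mask) >= min_pixels
    ]


# Two toy 256x256 chips: one empty, one with a small labeled patch.
empty = np.zeros((256, 256), dtype=np.uint8)
labeled = np.zeros((256, 256), dtype=np.uint8)
labeled[100:110, 50:60] = 1

print(chips_to_keep({"1.tif": empty, "2.tif": labeled}))  # ['2.tif']
```

In the real workflow the same chip names would then also be removed from the hillshade, slope, hpmf and stdon directories so that images and labels stay aligned.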

In [133]:
!python /workspace/code/remove_unlabled_chips.py 1 /workspace/data/hunting_pits/object_detection_data/split_data/labels/ /workspace/data/hunting_pits/object_detection_data/split_data/hillshade/ /workspace/data/hunting_pits/object_detection_data/split_data/slope/ /workspace/data/hunting_pits/object_detection_data/split_data/hpmf/ /workspace/data/hunting_pits/object_detection_data/split_data/stdon/

**Hunting pits bounding boxes**\
The segmentation masks were converted to YOLO labels for object detection.

In [21]:
!pip install pybboxes

Collecting pybboxes
  Downloading pybboxes-0.0.2-py3-none-any.whl (11 kB)
Installing collected packages: pybboxes
Successfully installed pybboxes-0.0.2
You should consider upgrading via the '/usr/bin/python -m pip install --upgrade pip' command.

Convert segmentation masks to YOLO bounding boxes
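
A YOLO label line stores `class x_center y_center width height`, with all four box values normalized by the image size. The sketch below shows one way to derive such boxes from an integer mask. It is an illustration only: `mask_to_yolo_boxes` is a hypothetical function, not the code in `masks_to_boxes.py`, and it assumes every distinct non-zero value in the mask marks one object. For a binary mask containing several objects, a connected-component step (for example `scipy.ndimage.label`) would be needed first.

```python
import numpy as np


def mask_to_yolo_boxes(mask, class_id=0):
    """Convert an integer mask to YOLO boxes (class, xc, yc, w, h), normalized to 0..1.

    Assumes each distinct non-zero value in `mask` is a separate object.
    """
    height, width = mask.shape
    boxes = []
    for value in np.unique(mask):
        if value == 0:  # skip background
            continue
        rows, cols = np.nonzero(mask == value)
        # Pixel-edge coordinates of the tight bounding box.
        x_min, x_max = cols.min(), cols.max() + 1
        y_min, y_max = rows.min(), rows.max() + 1
        boxes.append((
            class_id,
            float((x_min + x_max) / 2 / width),
            float((y_min + y_max) / 2 / height),
            float((x_max - x_min) / width),
            float((y_max - y_min) / height),
        ))
    return boxes


mask = np.zeros((256, 256), dtype=np.uint8)
mask[64:128, 32:96] = 1
print(mask_to_yolo_boxes(mask))  # [(0, 0.25, 0.375, 0.25, 0.25)]
```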

In [135]:
!python /workspace/code/masks_to_boxes.py /workspace/temp/ /workspace/data/hunting_pits/object_detection_data/split_data/labels/ 256 1 /workspace/data/hunting_pits/object_detection_data/split_data/yolo/ 

# YOLOv5

**Partition the dataset into train, validation, and test sets containing 80%, 10%, and 10% of the data, respectively.**

In [165]:
# Read images and annotations
import os
from sklearn.model_selection import train_test_split

image_dir = '/workspace/data/hunting_pits/object_detection_data/split_data/stdon/'
annotation_dir = '/workspace/data/hunting_pits/object_detection_data/split_data/yolo/'
images = [os.path.join(image_dir, x) for x in os.listdir(image_dir) if x.endswith('.tif')]
annotations = [os.path.join(annotation_dir, x) for x in os.listdir(annotation_dir) if x.endswith('.txt')]

# Sort both lists so that each image lines up with its annotation file
images.sort()
annotations.sort()

# Split off 80% for training, then halve the remaining 20% into validation and test sets
train_images, val_images, train_annotations, val_annotations = train_test_split(images, annotations, test_size = 0.2, random_state = 1)
val_images, test_images, val_annotations, test_annotations = train_test_split(val_images, val_annotations, test_size = 0.5, random_state = 1)
!mkdir /workspace/data/hunting_pits/object_detection_data/images/
!mkdir /workspace/data/hunting_pits/object_detection_data/annotations/
!mkdir /workspace/data/hunting_pits/object_detection_data/images/train /workspace/data/hunting_pits/object_detection_data/images/val /workspace/data/hunting_pits/object_detection_data/images/test /workspace/data/hunting_pits/object_detection_data/annotations/train /workspace/data/hunting_pits/object_detection_data/annotations/val /workspace/data/hunting_pits/object_detection_data/annotations/test
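
The split above relies on the sorted image and annotation lists having the same length and order. If a chip is missing its annotation (or the other way round), the pairs shift silently. The sketch below pairs files by their name stem before splitting. It is an alternative under stated assumptions, not the notebook's method: `pair_by_stem` and `split_80_10_10` are hypothetical helpers, and the shuffle uses Python's `random` module instead of `train_test_split`, so the resulting partition differs from the one above.

```python
import random
from pathlib import Path


def pair_by_stem(image_paths, annotation_paths):
    """Pair each image with the annotation that has the same file stem.

    Images without a matching annotation are dropped.
    """
    annotations = {Path(p).stem: p for p in annotation_paths}
    return [
        (img, annotations[Path(img).stem])
        for img in sorted(image_paths)
        if Path(img).stem in annotations
    ]


def split_80_10_10(pairs, seed=1):
    """Shuffle the pairs reproducibly and split them 80/10/10."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_train = len(pairs) * 8 // 10
    n_val = len(pairs) // 10
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test
```

Because images and annotations travel together as tuples, the three subsets cannot get out of sync, and any unmatched files are visible as a difference between the number of images and the number of pairs.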

In [114]:
!git clone https://github.com/ivder/YoloBBoxChecker.git

Cloning into 'YoloBBoxChecker'...
remote: Enumerating objects: 21, done.
remote: Total 21 (delta 0), reused 0 (delta 0), pack-reused 21
Unpacking objects: 100% (21/21), 5.94 KiB | 74.00 KiB/s, done.


In [115]:
!ls

 Dockerfile
 Extract_topographcical_indices.py
 LICENSE
'Laz to DEM.ipynb'
'Mapping cultural remains with object detection.ipynb'
 README.md
 Select_chips_with_labels.py
 Select_study_areas.py
 Untitled1.ipynb
 Untitled2.ipynb
'Williams notes.ipynb'
 YoloBBoxChecker
 __pycache__
 create_aoi_poolygon.py
 create_labels.py
 data
 evaluate_model.py
 images
 inference.py
 inspect_distribution.py
 laz_to_dem.py
 lidar_tile_footprint.py
 masks_to_boxes.py
 post_processing.py
 prepare_the_moon.py
 remove_unlabled_chips.py
 select_laz_tiles.py
 select_lidar_tiles.py
 split_training_data.py
 train.py
 utils


In [151]:
!python /workspace/code/YoloBBoxChecker/main.py

Input:/workspace/data/hunting_pits/object_detection_data/8bitimage/0022.txt
Output:/workspace/data/hunting_pits/object_detection_data/results/0022.tif


Compare lists of bounding boxes

In [51]:
import glob
import os
boxdir1 = '/workspace/data/object_detection/bounding_boxes/hunting_pits/'
boxdir2 = '/workspace/data/object_detection/bounding_boxes/charcoal_kilns/'

# glob needs a wildcard pattern: a bare directory path matches only the directory itself,
# and the basename of a path ending in '/' is an empty string
list1 = [os.path.basename(x) for x in glob.glob(os.path.join(boxdir1, '*.txt'))]
list2 = [os.path.basename(path) for path in glob.glob(os.path.join(boxdir2, '*.txt'))]

#set(list1) ^ set(list2)
for item in list1:
    print(item)
print(list1)


In [60]:
import os

def compare(list1, list2):
    """Print the index in list1 of every item that also occurs in list2."""
    for i in range(len(list1)):
        for j in range(len(list2)):
            if list1[i] == list2[j]:
                print(i)

boxdir1 = '/workspace/data/object_detection/bounding_boxes/hunting_pits/'
boxdir2 = '/workspace/data/object_detection/bounding_boxes/charcoal_kilns/'

# Collect the names of the label files (.txt) in each directory
list1 = [name for name in os.listdir(boxdir1) if name.endswith('.txt')]
list2 = [name for name in os.listdir(boxdir2) if name.endswith('.txt')]

compare(list1, list2)

16
17
38
232
343
381
382
383
384
523
624
701
702
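
The nested loop above prints positions in `list1`, which then have to be mapped back to file names. Python sets give the names directly. The following is a small sketch with a hypothetical helper, `compare_names`, working on plain lists of file names:

```python
def compare_names(names_a, names_b):
    """Summarize how two collections of file names overlap."""
    set_a, set_b = set(names_a), set(names_b)
    return {
        "common": sorted(set_a & set_b),       # present in both
        "only_in_one": sorted(set_a ^ set_b),  # symmetric difference
        "all_unique": sorted(set_a | set_b),   # union
    }


result = compare_names(["1.txt", "2.txt", "3.txt"], ["2.txt", "3.txt", "4.txt"])
print(result["common"])           # ['2.txt', '3.txt']
print(result["only_in_one"])      # ['1.txt', '4.txt']
print(len(result["all_unique"]))  # 4
```

Set lookups also avoid the quadratic cost of comparing every name in one list with every name in the other, which matters once each directory holds tens of thousands of files.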


In [None]:
!python /workspace/code/tools/split_training_data.py /workspace/data/topographical_indices_normalized/hillshade/ /workspace/data/object_detection/split_hillshade/ --tile_size 250

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 0 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 1
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 488.80img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/18D022_67450_5775_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 401
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 498.48img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/18D022_67475_5750_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input 

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 486.11img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19A017_62475_5225_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 6800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 6801
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 481.98img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19A017_62500_5225_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 7200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 720

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 13200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 13201
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 462.68img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19A019_62725_5650_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 13600 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 13601
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 478.20img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19A019_62725_5675_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 470.78img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71025_7025_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 20000 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 20001
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 484.54img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71025_7075_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 20400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 26400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 26401
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 477.60img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71125_7025_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 26800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 26801
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 482.65img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71125_7075_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 477.89img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71475_7125_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 33200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 33201
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 473.82img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F047_71475_7150_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 33600 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 39600 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 39601
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 477.17img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F048_71300_7325_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 40000 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 40001
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 474.29img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F048_71300_7475_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 479.12img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F048_71425_7425_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 46400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 46401
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 493.04img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19F048_71425_7450_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 46800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 52800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 52801
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 483.08img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71525_7150_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 53200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 53201
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 479.97img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71525_7175_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 453.59img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71575_7200_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 59600 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 59601
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 483.82img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71575_7225_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 60000 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 66000 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 66001
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 477.83img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71650_7075_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 66400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 66401
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 471.49img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71650_7100_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 478.61img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71725_7025_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 72800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 72801
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 494.43img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71725_7050_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 73200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 79200 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 79201
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 485.34img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71825_7100_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 79600 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 79601
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 470.00img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/19G013_71850_7100_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 498.78img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/20C012_64500_5500_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 86000 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 86001
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 480.62img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/20C012_64500_5675_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 86400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 

Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 92400 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 92401
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 480.75img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/20C027_65625_6025_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
There are 92800 files in the /workspace/data/object_detection/split_hillshade/
New image name will start with 92801
Generating: 100%|███████████████████████████| 400/400 [00:00<00:00, 465.47img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/20C027_65625_6050_25.tif are added at /workspace/data/object_detection/split_hills

Generating: 100%|███████████████████████████| 400/400 [00:01<00:00, 215.51img/s]
400 tiles sample of /workspace/data/topographical_indices_normalized/hillshade/20D019_67050_4825_25.tif are added at /workspace/data/object_detection/split_hillshade/
Input Image File Shape (D, H, W):(1, 5000, 5000)
crop_size=250, stride=250
Padding Image File Shape (D, H, W):(1, 5000, 5000)
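
The counts in this log follow from the tiling parameters: a 5000 x 5000 raster cut with `crop_size=250` and `stride=250` gives 20 x 20 = 400 non-overlapping tiles, which is why the file counter grows by 400 for every input image. A minimal sketch of that arithmetic, assuming the image is padded up to a whole number of tiles (the helper `tiles_per_image` is hypothetical and is not taken from `split_training_data.py`):

```python
import math

def tiles_per_image(height, width, crop_size, stride):
    """Count tiles, assuming the image is padded so that whole tiles fit."""
    rows = math.ceil(max(height - crop_size, 0) / stride) + 1
    cols = math.ceil(max(width - crop_size, 0) / stride) + 1
    return rows * cols

# Parameters taken from the log output above.
n_tiles = tiles_per_image(5000, 5000, crop_size=250, stride=250)
print(n_tiles)  # 400

# The running file counter should advance by n_tiles per image.
print(86000 + n_tiles)  # 86400
```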


In [64]:
unique_items = list(set(list1 + list2))
print(len(unique_items))

2675


In [48]:
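# os.listdir already returns bare file names, so no basename call is needed.
# ^ is the symmetric difference: names present in exactly one of the two directories.
difference = set(os.listdir(boxdir1)) ^ set(os.listdir(boxdir2))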

In [50]:
print(difference)

{'49972.txt', '3611.txt', '108127.txt', '53372.txt', '128116.txt', '63623.txt', '128042.txt', '33669.txt', '6026.txt', '125768.txt', '62758.txt', '57719.txt', '18624.txt', '94196.txt', '20942.txt', '124687.txt', '3921.txt', '85944.txt', '115054.txt', '87992.txt', '129363.txt', '59606.txt', '68559.txt', '27446.txt', '120153.txt', '3662.txt', '118524.txt', '29912.txt', '100287.txt', '136387.txt', '17229.txt', '34543.txt', '66901.txt', '136914.txt', '128131.txt', '126803.txt', '52879.txt', '126885.txt', '106724.txt', '39214.txt', '24665.txt', '37538.txt', '114925.txt', '16207.txt', '58409.txt', '13884.txt', '29253.txt', '4015.txt', '66849.txt', '102667.txt', '57053.txt', '71174.txt', '56606.txt', '126902.txt', '128146.txt', '95794.txt', '66867.txt', '5341.txt', '126942.txt', '102191.txt', '64677.txt', '109062.txt', '129588.txt', '72435.txt', '49831.txt', '25855.txt', '58817.txt', '86232.txt', '64935.txt', '113151.txt', '101930.txt', '34583.txt', '126822.txt', '49796.txt', '2552.txt', '328
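
The symmetric difference printed above mixes files that exist only in the hunting-pit directory with files that exist only in the charcoal-kiln directory. A small sketch that separates the two groups and the overlap; the function name `compare_label_sets` and the file names are hypothetical and only serve as illustration:

```python
def compare_label_sets(names_a, names_b):
    """Split two collections of file names into only-in-a, only-in-b and shared."""
    a, b = set(names_a), set(names_b)
    return {"only_a": a - b, "only_b": b - a, "both": a & b}

# Hypothetical label file names, not the real directory contents.
pits = ["1.txt", "2.txt", "3.txt"]
kilns = ["3.txt", "4.txt"]

result = compare_label_sets(pits, kilns)
for key in ("only_a", "only_b", "both"):
    print(key, sorted(result[key]))
```

The union of `only_a` and `only_b` equals the symmetric difference, and `both` lists the tiles whose label files need to be merged.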

In [136]:
import os

def merge_files(file_one, file_two, merged_file):
    """Concatenate two label files, write the result to merged_file and return it."""
    with open(file_one, 'r') as reader:
        content = reader.read()
    # Avoid gluing the last label of file_one to the first label of file_two.
    if content and not content.endswith('\n'):
        content += '\n'
    with open(file_two, 'r') as reader:
        content += reader.read()
    with open(merged_file, 'w') as writer:
        writer.write(content)
    return content

boxdir1 = '/workspace/data/object_detection/bounding_boxes/hunting_pits/'
boxdir2 = '/workspace/data/object_detection/bounding_boxes/charcoal_kilns/'
mergeddir = '/workspace/data/object_detection/bounding_boxes/merged_charcoal_hunting'

# Make sure the output directory exists before any merged file is written.
os.makedirs(mergeddir, exist_ok=True)

files_boxdir1 = os.listdir(boxdir1)
files_boxdir2 = os.listdir(boxdir2)

# Collect all hunting-pit labels. Tiles that also have charcoal-kiln labels
# are merged and written to mergeddir.
boxdir1_content = ''
for f in files_boxdir1:
    if f in files_boxdir2:
        boxdir1_content += merge_files(os.path.join(boxdir1, f), os.path.join(boxdir2, f), os.path.join(mergeddir, f))
    else:
        with open(os.path.join(boxdir1, f), 'r') as reader:
            boxdir1_content += reader.read()

# Same pass from the charcoal-kiln side. Shared tiles are merged again,
# which rewrites the same file in mergeddir with identical content.
boxdir2_content = ''
for f in files_boxdir2:
    if f in files_boxdir1:
        boxdir2_content += merge_files(os.path.join(boxdir1, f), os.path.join(boxdir2, f), os.path.join(mergeddir, f))
    else:
        with open(os.path.join(boxdir2, f), 'r') as reader:
            boxdir2_content += reader.read()


In [145]:
!python /workspace/code/object_detection/merge_bounding_boxes.py /workspace/data/object_detection/bounding_boxes/hunting_pits/ /workspace/data/object_detection/bounding_boxes/charcoal_kilns/ /workspace/data/object_detection/bounding_boxes/merged_charcoal_hunting/

dir1 files = 1957
dir2 files = 731
tot = 2688
overlapping files = 13 
expected = 2675
copied files = 
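
The reported counts are consistent with each other: 1957 + 731 = 2688 files in total, and subtracting the 13 overlapping names gives the expected 2675, the same number as the unique count printed earlier. A hedged sketch of how the merged directory could be checked against that expectation; the helper `merge_is_complete` and the demo file names are hypothetical and are not part of `merge_bounding_boxes.py`:

```python
import os
import tempfile

def merge_is_complete(dir1, dir2, merged_dir):
    """True if merged_dir holds exactly one file per unique name in dir1 and dir2."""
    expected = set(os.listdir(dir1)) | set(os.listdir(dir2))
    return set(os.listdir(merged_dir)) == expected

# Inclusion-exclusion on the counts reported above.
expected_count = 1957 + 731 - 13
print(expected_count)  # 2675

# Tiny demonstration on temporary directories with hypothetical file names.
layout = {
    "pits": ["1.txt", "2.txt"],
    "kilns": ["2.txt", "3.txt"],
    "merged": ["1.txt", "2.txt", "3.txt"],
}
with tempfile.TemporaryDirectory() as root:
    for folder, names in layout.items():
        os.makedirs(os.path.join(root, folder))
        for name in names:
            open(os.path.join(root, folder, name), "w").close()
    demo_ok = merge_is_complete(
        os.path.join(root, "pits"),
        os.path.join(root, "kilns"),
        os.path.join(root, "merged"),
    )
print(demo_ok)  # True
```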

In [154]:
!python /workspace/code/object_detection/select_chips_with_labels.py /workspace/data/object_detection/bounding_boxes/merged_charcoal_hunting/ /workspace/data/object_detection/split_hillshade/ /workspace/data/object_detection/bounding_boxes/topographical_indices/hillshade/
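
The script receives three directories: the merged label files, the split hillshade tiles, and an output directory. Its source is not shown in this notebook, so the following is only a sketch of what such a selection step might look like, assuming each tile shares its numeric file stem with its label file and that tiles are stored as `.tif` (both are assumptions, as are the function signature and the demo file names):

```python
import os
import shutil
import tempfile

def select_chips_with_labels(label_dir, chip_dir, out_dir, chip_ext=".tif"):
    """Copy every chip whose stem matches a .txt label file; return the copied names."""
    os.makedirs(out_dir, exist_ok=True)
    copied = []
    for label_name in sorted(os.listdir(label_dir)):
        stem, ext = os.path.splitext(label_name)
        if ext != ".txt":
            continue  # ignore anything that is not a label file
        chip_name = stem + chip_ext
        chip_path = os.path.join(chip_dir, chip_name)
        if os.path.isfile(chip_path):
            shutil.copy(chip_path, os.path.join(out_dir, chip_name))
            copied.append(chip_name)
    return copied

# Demonstration with hypothetical files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    labels = os.path.join(root, "labels")
    chips = os.path.join(root, "chips")
    selected = os.path.join(root, "selected")
    os.makedirs(labels)
    os.makedirs(chips)
    for name in ("1.txt", "3.txt"):
        open(os.path.join(labels, name), "w").close()
    for name in ("1.tif", "2.tif", "3.tif"):
        open(os.path.join(chips, name), "w").close()
    copied = select_chips_with_labels(labels, chips, selected)
    selected_files = sorted(os.listdir(selected))
print(copied)
```

Only the chips with a matching label file end up in the output directory, which keeps unlabeled tiles out of the training data.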

In [156]:
!python /workspace/code/object_detection/partition_YOLO_data.py /workspace/data/object_detection/bounding_boxes/topographical_indices/hillshade/ /workspace/data/object_detection/bounding_boxes/merged_charcoal_hunting/ /workspace/data/object_detection/bounding_boxes/partitioned_data/train_hillshade /workspace/data/object_detection/bounding_boxes/partitioned_data/val_hillshade /workspace/data/object_detection/bounding_boxes/partitioned_data/test_hillshade /workspace/data/object_detection/bounding_boxes/partitioned_data/train_boxes /workspace/data/object_detection/bounding_boxes/partitioned_data/val_boxes /workspace/data/object_detection/bounding_boxes/partitioned_data/test_boxes
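
The call above passes an image directory, a label directory, and separate train, validation and test output directories for both images and boxes. The split logic of `partition_YOLO_data.py` is not shown here. As an illustration only, a seeded random split of tile IDs could look like the sketch below; the 80/10/10 ratio, the seed and the function name `partition_ids` are assumptions:

```python
import random

def partition_ids(ids, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle tile IDs reproducibly and split them into train, val and test lists."""
    ids = sorted(ids)                    # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)     # seeded shuffle for reproducibility
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }

# Hypothetical tile IDs; real IDs would be the shared stems of chips and label files.
splits = partition_ids([str(i) for i in range(1, 101)])
print({name: len(members) for name, members in splits.items()})
```

Splitting on the shared file stem, rather than on images and labels separately, keeps each chip and its bounding-box file in the same partition.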

## Inspect annotations