## **MACHINE LEARNING IN ARCHAEOLOGY: BASICS**
### **Deep Learning Feature Detection and Segmentation**

**Iban Berganzo-Besga** 1,2,3
(https://orcid.org/0000-0002-6161-2452)
iban.berganzo@bsc.es

1. **Senior Research Engineer**

* *Computational Social Sciences and Humanities (CSSH)  
Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS)*

* https://bsc.es/berganzo-besga-iban

2. **Associate Researcher**

* *Ramsey Laboratory for Environmental Archaeology (RLEA)  
University of Toronto Mississauga (UTM)*

* https://www.utm.utoronto.ca/ramsey-lab/people/iban-berganzo-besga

3. **Associate Researcher**

* *Landscape Archaeology Research Group (GIAP)  
Catalan Institute of Classical Archaeology (ICAC)*

* https://icac.cat/en/who-are-we/staff/iberganzo/

### 0. Python Dependencies

Install the required libraries at specific versions

In [None]:
!pip uninstall -y Pillow
!pip install Pillow==9.5.0

In [None]:
import PIL
print(PIL.__version__)

In [None]:
!pip uninstall -y torch torchvision torchaudio
!pip install torch==2.3.0+cu121 torchvision==0.18.0+cu121 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121

In [None]:
import torch
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)

### 1. Sources & Algorithms Selection

**Sources**: Historical Maps & Drone Imagery

**Algorithm**: Instance Segmentation (YOLOv9)

In [None]:
!git clone https://github.com/iberganzo/MLArchaeologyBasics.git

### 2. Data Pretreatment

**Image Format & Size**: JPG, 512x512 px

**Image Labelling**: YOLO format

**Labelling Tool**: VGG Image Annotator (VIA), https://www.robots.ox.ac.uk/~vgg/software/via/via_demo.html

**Work Dataset**: Training (60%), validation (20%) and testing (20%)
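The 60/20/20 split can be produced in several ways. The block below is a minimal sketch, assuming the images are split at random with a fixed seed; the function name `split_dataset` and the file names are hypothetical and not part of the repository.

```python
import random

def split_dataset(names, train=0.6, val=0.2, seed=0):
    """Shuffle names reproducibly and split them into train/val/test lists."""
    names = sorted(names)                  # fixed starting order
    random.Random(seed).shuffle(names)     # reproducible shuffle
    n_train = round(len(names) * train)
    n_val = round(len(names) * val)
    # Whatever remains after train and val becomes the test set
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])

# Hypothetical file names, for illustration only
tiles = [f"tile_{i:03d}.jpg" for i in range(10)]
train_set, val_set, test_set = split_dataset(tiles)
print(len(train_set), len(val_set), len(test_set))
```

Fixing the seed keeps the three subsets identical between runs, so evaluation results stay comparable when the model is retrained.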

Remove previous data

In [None]:
# -f suppresses the error when a file or folder content is already missing
!rm -f MLArchaeologyBasics/yolov9/dataset/images/train/*
!rm -f MLArchaeologyBasics/yolov9/dataset/images/val/*
!rm -f MLArchaeologyBasics/yolov9/dataset/images/test/*
!rm -f MLArchaeologyBasics/yolov9/dataset/labels/train/*
!rm -f MLArchaeologyBasics/yolov9/dataset/labels/val/*
!rm -f MLArchaeologyBasics/yolov9/dataset/labels/test/*

!rm -f MLArchaeologyBasics/yolov9/dataset/train.txt
!rm -f MLArchaeologyBasics/yolov9/dataset/train.cache
!rm -f MLArchaeologyBasics/yolov9/dataset/val.txt
!rm -f MLArchaeologyBasics/yolov9/dataset/val.cache
!rm -f MLArchaeologyBasics/yolov9/dataset/test.txt
!rm -f MLArchaeologyBasics/yolov9/dataset/test.cache

!rm -f MLArchaeologyBasics/yolov9/labelling/yolo/*
!rm -f MLArchaeologyBasics/yolov9/labelling/via/*

Add your images to the MLArchaeologyBasics/yolov9/dataset/images/ folders

Create the TXT files associated with the train, val and test images in the MLArchaeologyBasics/yolov9/dataset/ folder
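The TXT files can be written by hand or generated. The following is a minimal sketch, assuming each TXT file lists one image path per line and that paths take the form `./images/<split>/<name>.jpg`; both are assumptions, so check them against what `data/train.yaml` expects. The helper `write_image_list` is hypothetical, and the demo runs on a temporary folder with empty placeholder files.

```python
from pathlib import Path
import tempfile

def write_image_list(dataset_dir, split):
    """Write <split>.txt listing every JPG under images/<split>/, one per line."""
    dataset_dir = Path(dataset_dir)
    images = sorted((dataset_dir / "images" / split).glob("*.jpg"))
    # Assumed path layout; adjust the prefix if the YAML config expects another root
    lines = [f"./images/{split}/{p.name}" for p in images]
    out = dataset_dir / f"{split}.txt"
    out.write_text("\n".join(lines) + ("\n" if lines else ""))
    return out

# Demo on a throwaway directory with empty placeholder files
listings = {}
with tempfile.TemporaryDirectory() as tmp:
    for split in ("train", "val", "test"):
        folder = Path(tmp) / "images" / split
        folder.mkdir(parents=True)
        (folder / f"example_{split}.jpg").touch()
        listings[split] = write_image_list(tmp, split).read_text()

print(listings["train"], end="")
```

For the real dataset, the same function would be called once per split with `MLArchaeologyBasics/yolov9/dataset` as `dataset_dir`.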

Add your labelled VIA JSON data to the MLArchaeologyBasics/yolov9/labelling/via/ folder

In [None]:
%cd MLArchaeologyBasics/yolov9/labelling/
!python3 viatoyolo.py --img 512 512
#!python3 viatoyolo.py --img 1024 1024
%cd ../../../

Download and add the resulting labelled YOLO TXT data to the MLArchaeologyBasics/yolov9/dataset/labels/ folders
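To make the label format concrete, the sketch below converts one VIA polygon into one YOLO segmentation line: a class index followed by the polygon vertices, each normalised by the image width and height. It is an illustration only and not the code of `viatoyolo.py`; the function name `via_polygon_to_yolo`, the single class index 0 and the sample coordinates are assumptions.

```python
def via_polygon_to_yolo(shape, img_w, img_h, class_id=0):
    """Turn a VIA polygon (pixel coordinates) into a YOLO segmentation line."""
    xs, ys = shape["all_points_x"], shape["all_points_y"]
    coords = []
    for x, y in zip(xs, ys):
        # YOLO stores coordinates as fractions of the image size
        coords += [x / img_w, y / img_h]
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])

# Hypothetical triangle drawn on a 512x512 px image
region = {
    "name": "polygon",
    "all_points_x": [128, 384, 256],
    "all_points_y": [128, 128, 384],
}
line = via_polygon_to_yolo(region, img_w=512, img_h=512)
print(line)  # 0 0.250000 0.250000 0.750000 0.250000 0.500000 0.750000
```

Because the coordinates are normalised, the image size passed to the converter must match the real size of the labelled images, which is why the `--img` argument changes between 512 and 1024 px datasets.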

### 3. Algorithm Training

**Hyperparameters**: Image Size (512x512 px), Epochs (50), Batch Size (32)

**Training Data**: Transfer Learning

In [None]:
%cd MLArchaeologyBasics/yolov9/

In [None]:
!apt-get update
!apt-get install -y megatools
!megadl 'https://mega.nz/#!VJ9AASoQ!p2S9oeIKUGKncYXO4wQ682riWQYUIxjhToQiB77_uc8'

In [None]:
!python3 segment/train.py --workers 8 --device 0 --batch 32 --data data/train.yaml --img 512 --cfg models/segment/gelan-c-seg_custom.yaml --weights 'gelan-c-seg.pt' --hyp hyp.scratch-high.yaml --no-overlap --epochs 50 --close-mosaic 10 --name exercise_yolov9_seg_train

### 3. Algorithm Evaluation: Validation Dataset

**Recall**: TP / (TP + FN)

**Precision**: TP / (TP + FP)

**F1**: (2 * R * P) / (R + P)
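The three formulas above can be checked by hand with a few lines of Python. This is a minimal sketch; the function name `detection_metrics` and the counts are made up for illustration and are not results from this exercise.

```python
def detection_metrics(tp, fp, fn):
    """Return (recall, precision, f1) from true/false positive and false negative counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = recall + precision
    f1 = (2 * recall * precision) / denom if denom else 0.0
    return recall, precision, f1

# Illustrative counts only: 80 features found, 20 false alarms, 10 features missed
recall, precision, f1 = detection_metrics(tp=80, fp=20, fn=10)
print(f"Recall={recall:.3f}  Precision={precision:.3f}  F1={f1:.3f}")
```

The guards against a zero denominator matter in practice: a model that detects nothing has TP + FP = 0, and precision would otherwise raise a division error.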

In [None]:
!python3 segment/val.py --data data/val.yaml --img 512 --batch 32 --conf 0.001 --iou 0.5 --device 0 --weights 'runs/train-seg/exercise_yolov9_seg_train/weights/best.pt' --save-json --name exercise_yolov9_seg_val

In [None]:
import os
from IPython.display import display, Image

# Display the JPG images saved by the validation run, in name order
directory = '/content/MLArchaeologyBasics/yolov9/runs/val-seg/exercise_yolov9_seg_val/'
for filename in sorted(os.listdir(directory)):
    if filename.endswith((".jpg", ".jpeg")):
        display(Image(filename=os.path.join(directory, filename)))

### 4. Algorithm Evaluation: Testing Dataset

**Recall**: TP / (TP + FN)

**Precision**: TP / (TP + FP)

**F1**: (2 * R * P) / (R + P)

In [None]:
!python3 segment/val.py --data data/test.yaml --img 512 --batch 32 --conf 0.001 --iou 0.5 --device 0 --weights 'runs/train-seg/exercise_yolov9_seg_train/weights/best.pt' --save-json --name exercise_yolov9_seg_test

In [None]:
import os
from IPython.display import display, Image

# Display the JPG images saved by the test run, in name order
directory = '/content/MLArchaeologyBasics/yolov9/runs/val-seg/exercise_yolov9_seg_test/'
for filename in sorted(os.listdir(directory)):
    if filename.endswith((".jpg", ".jpeg")):
        display(Image(filename=os.path.join(directory, filename)))

### 5. How to Improve the Algorithm

**Hyperparameters**: Increase Epoch Number

**Small Amount of Training Data**: Data Augmentation

**False Positives**: Negative Training & Filters

**ArchaeolDA**: https://github.com/iberganzo/ArchaeolDA

**VIAtoYOLO**: https://github.com/iberganzo/VIAtoYOLO

**Published Articles**: https://doi.org/10.1038/s41598-023-38190-x and https://doi.org/10.1002/arp.1822
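As a small illustration of the data augmentation point above, the sketch below mirrors a YOLO segmentation label horizontally so that it matches a horizontally flipped copy of the image. It is one simple augmentation chosen for illustration and does not reproduce the ArchaeolDA method; the function name `hflip_yolo_line` and the sample label are hypothetical.

```python
def hflip_yolo_line(line):
    """Mirror a YOLO segmentation label left to right (x becomes 1 - x)."""
    parts = line.split()
    class_id, values = parts[0], [float(v) for v in parts[1:]]
    flipped = []
    # Values alternate x, y; only the x coordinates change in a horizontal flip
    for x, y in zip(values[0::2], values[1::2]):
        flipped += [1.0 - x, y]
    return " ".join([class_id] + [f"{v:.6f}" for v in flipped])

# Hypothetical label: one triangle of class 0 in normalised coordinates
original = "0 0.250000 0.250000 0.750000 0.250000 0.500000 0.750000"
flipped = hflip_yolo_line(original)
print(flipped)  # 0 0.750000 0.250000 0.250000 0.250000 0.500000 0.750000
```

The image itself has to be mirrored in the same way and saved under a new name together with the new label file, otherwise labels and pixels no longer agree.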