# Data preparation
---

**Author:** [rodoart](https://github.com/rodoart/)<br>
**Date created:** 2021/08/06<br>
**Last modified:** 2021/08/06<br>
**Description:** 
Using pretrained neural networks to segment images of rooms.

In [2]:
TYPE_OF_EXECUTION = 'colab'
# Options: 'colab', 'alone', or 'normal'

## Libraries

In [3]:
import sys
import subprocess
import pkg_resources

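# 'dvc[gdrive]' adds Google Drive remote support to DVC.
required = {
    'dvc', 'dvc[gdrive]', 'gdown'
}
installed = {pkg.key for pkg in pkg_resources.working_set}
# Package keys never include extras, so 'dvc[gdrive]' always counts as missing
# and pip runs on every execution (a no-op when everything is already installed).
missing = required - installed

if missing:
    python = sys.executable
    # Install quietly with the interpreter that runs this notebook.
    subprocess.check_call([python, '-m', 'pip', 'install', *missing], stdout=subprocess.DEVNULL)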


## GitHub

To `push` to GitHub, you need to generate an RSA key pair and register the public key on the GitHub website.
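
As a minimal sketch of the key-generation step, the helper below wraps the standard `ssh-keygen` tool. The function names, the 4096-bit key size, and the empty passphrase are assumptions made for illustration; they are not taken from this project. The cells below expect the resulting files at `/content/id_rsa` and `/content/id_rsa.pub`.

```python
import subprocess
from pathlib import Path


def build_keygen_command(key_path, comment):
    """Return the ssh-keygen arguments for an RSA key pair (hypothetical helper)."""
    return [
        "ssh-keygen",
        "-t", "rsa",           # key type
        "-b", "4096",          # key size in bits (an assumption, not a project requirement)
        "-C", comment,         # comment stored in the public key, usually an email
        "-N", "",              # empty passphrase so the notebook can use the key unattended
        "-f", str(key_path),   # private key path; the public key gets a ".pub" suffix
    ]


def generate_key(key_path, comment):
    """Run ssh-keygen and return the path of the public key file."""
    key_path = Path(key_path)
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_keygen_command(key_path, comment), check=True)
    return key_path.with_name(key_path.name + ".pub")


# Example call (requires ssh-keygen on the PATH):
# public_key = generate_key("/content/id_rsa", "you@example.com")
# print(public_key.read_text())
```

The printed public key is the text to paste into the SSH keys section of your GitHub account settings. An empty passphrase is convenient in a throwaway Colab session but leaves the private key unprotected, so treat the key as disposable.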

In [4]:
GITHUB_PULL_NEEDED = True

In [5]:
from os import chdir

In [6]:
if TYPE_OF_EXECUTION in ('alone', 'colab') and GITHUB_PULL_NEEDED:
  # Copy the uploaded key pair to the directory where SSH looks for it.
  !mkdir -p /root/.ssh/
  !cp /content/id_rsa.pub /root/.ssh/id_rsa.pub
  !cp /content/id_rsa /root/.ssh/id_rsa
  # Restrict permissions; SSH refuses private keys that other users can read.
  !chmod 600 ~/.ssh/id_rsa
  !chmod 600 ~/.ssh/id_rsa.pub
  # Add GitHub's host key to known_hosts so SSH does not prompt for confirmation.
  !ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts

  # Replace these with your own credentials.
  !git config --global user.email "rodoart@ciencias.unam.mx"
  !git config --global user.name "rodoart"

  # Clone the repo.
  !git clone git@github.com:rodoart/pet-surveillance.git

  # Reset the origin remote; change the URL if you work from your own fork.
  chdir('pet-surveillance')
  !git remote remove origin
  !git remote add origin git@github.com:rodoart/pet-surveillance.git

elif TYPE_OF_EXECUTION in ('alone', 'colab'):
  !git clone https://github.com/rodoart/pet-surveillance
  chdir('pet-surveillance')

else:
  chdir('..')

# github.com:22 SSH-2.0-babeld-831dd33d
Cloning into 'pet-surveillance'...
remote: Enumerating objects: 371, done.
remote: Counting objects: 100% (371/371), done.
remote: Compressing objects: 100% (231/231), done.
remote: Total 371 (delta 159), reused 333 (delta 121), pack-reused 0
Receiving objects: 100% (371/371), 9.83 MiB | 7.55 MiB/s, done.
Resolving deltas: 100% (159/159), done.


## Path function

In [7]:
import sys
sys.path.append('.')

In [8]:
from pet_surveillance.utils.paths import make_dir_function, is_valid

In [9]:
local_dir = make_dir_function()

## Prepare the dataset

### Download

This experiment uses the Home interior sample from Unity Computer Vision Datasets, which contains 1,000 synthetic images with several types of labeling, including semantic segmentation.

![Example from the dataset](https://content.cdntwrk.com/files/aHViPTEwODAwMiZjbWQ9aXRlbWVkaXRvcmltYWdlJmZpbGVuYW1lPWl0ZW1lZGl0b3JpbWFnZV82MGY5YmY3MGYxNmZkLnBuZyZ2ZXJzaW9uPTAwMDAmc2lnPTIzZTcwMzIwNzIzMGIyMzAwNmZkM2VhNGZiODFiYzkz)

Access to the dataset can be obtained directly on the dataset's page. The download step below only works for contributors to this project, but it is easy to adapt once you have your own download link.
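
If you have your own link, the download step could be replaced with something like the sketch below. It assumes the link points directly at a zip archive over HTTP(S); `download_and_extract` and its parameters are hypothetical names and are not part of `pet_surveillance`. Google Drive share links usually need a dedicated tool such as `gdown` instead of a plain URL fetch.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, archive_path, extract_dir):
    """Download a zip archive from `url` and unpack it into `extract_dir`.

    Returns the sorted relative paths of all files found under `extract_dir`.
    """
    archive_path = Path(archive_path)
    extract_dir = Path(extract_dir)
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    extract_dir.mkdir(parents=True, exist_ok=True)

    # Skip the download when the archive is already on disk.
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(extract_dir)

    return sorted(
        path.relative_to(extract_dir).as_posix()
        for path in extract_dir.rglob("*")
        if path.is_file()
    )


# Example call with a placeholder URL (replace it with your own link):
# files = download_and_extract(
#     "https://example.com/dataset.zip",
#     "tmp/dataset.zip",
#     "data/raw/semantic_segmentation/unity_residential_interiors",
# )
```

Keeping the archive on disk and checking for it first means that re-running the cell does not download the data again.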

In [10]:
from pet_surveillance.data.make_dataset import DataDownload

In [11]:
data_object = DataDownload(workspace='')
data_object.start()
data_path = data_object.dataset_processed_path

Unity_Residential_Interiors.zip has been downloaded!
Unity_Residential_Interiors.zip has been unzipped to the directory /content/pet-surveillance/tmp/unity_residential_interiors!
The files have been moved or already exist.
The directory /content/pet-surveillance/data/processed/semantic_segmentation/unity_residential_interiors already existed and isn't empty!


## Commits and updates

In [12]:
!dvc add /content/pet-surveillance/data/raw/semantic_segmentation/unity_residential_interiors/images

Streaming output truncated to the last 5000 lines.

In [13]:
!dvc push

Go to the following link in your browser:

    https://accounts.google.com/o/oauth2/auth?client_id=710796635688-iivsgbgsb6uv1fap6635dhvuei09o66c.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.appdata&access_type=offline&response_type=code&approval_prompt=force

Enter verification code: 4/1AdQt8qh5_ChnpGnLzc6Bt-SU0bZ7aY_wstTSZOXCQhbp8QFv9FL4bDC_6Io
Authentication successful.

In [14]:
!git add -A

In [17]:
!git commit -m "Added raw data."

[master 2f1ab81] Added raw data.
 2 files changed, 6 insertions(+)
 create mode 100644 data/raw/semantic_segmentation/unity_residential_interiors/.gitignore
 create mode 100644 data/raw/semantic_segmentation/unity_residential_interiors/images.dvc


In [18]:
!git pull origin master

From github.com:rodoart/pet-surveillance
 * branch            master     -> FETCH_HEAD
Already up to date.


In [19]:
!git push origin master

Counting objects: 8, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (8/8), 694 bytes | 694.00 KiB/s, done.
Total 8 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:rodoart/pet-surveillance.git
   e55711f..2f1ab81  master -> master
