# Crop Labeled Images based on Depth

There are roughly 600 labeled panos, each with 4 images.

Additional context:
* Pitch is always 0
* Heading is 45 / 135 / 225 / 315 based on the image filename
* photg_heading is the same as heading because our pitch is 0
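
Since the heading is read from the image filename, a small parser can make that step explicit. This is only a sketch: the naming pattern `<name>_<heading>.jpg` and the helper `heading_from_filename` are assumptions for illustration, not the project's confirmed convention, so the regex would need adjusting to the real filenames.

```python
import re

# The four headings listed above
VALID_HEADINGS = {45, 135, 225, 315}

def heading_from_filename(filename):
    """Return the heading (degrees) encoded at the end of an image filename.

    Assumes a hypothetical pattern such as '1908_135.jpg'.
    """
    match = re.search(r'_(\d+)\.\w+$', filename)
    if match is None:
        raise ValueError(f'No heading found in {filename!r}')
    heading = int(match.group(1))
    if heading not in VALID_HEADINGS:
        raise ValueError(f'Unexpected heading {heading} in {filename!r}')
    return heading

print(heading_from_filename('1908_135.jpg'))
```

Rejecting anything outside the four known headings catches misnamed files early instead of producing a wrong crop later.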

In [1]:
import pandas as pd
import os
import s3fs # for reading from S3FileSystem
import json # for working with JSON files 

## Examine Depth Metadata

In [2]:
SAGEMAKER_PATH = r'/home/ec2-user/SageMaker'
PATH_META = os.path.join(SAGEMAKER_PATH, r'EDA/Formatting_Data/data/meta_with_depth.csv')
df_meta = pd.read_csv(PATH_META)
print(df_meta.shape)
df_meta.head()

(19022, 9)


,Unnamed: 0,date,lat,long,pano_id,name,pano_yaw_deg,tilt_yaw_deg,tilt_pitch_deg
0,0,2019-06,42.957503,-87.938367,XPRpjNDhowVo8zvqvSU1CA,1,91.979996,125.04,0.83
1,1,2016-10,42.899259,-88.047098,iEyn0apLSZvl4i4alUbfcA,10,145.81999,-178.26999,1.06
2,2,2011-08,42.921614,-87.881025,1BzC3WoFeJ8U1aUT9Hx8mg,100,359.18,30.769999,2.55
3,3,2018-09,43.050123,-88.040263,oRN5vilebPS0srDXRPylzw,1000,161.11,67.549995,1.57
4,4,2019-05,42.959289,-88.026043,KimNSirhP1TzngZkSpc8UA,10000,270.06,-114.81,1.35


In [3]:
# Describe the 3 main degree measurements
deg_cols = ['pano_yaw_deg', 'tilt_yaw_deg', 'tilt_pitch_deg']
df_meta[deg_cols].describe()

,pano_yaw_deg,tilt_yaw_deg,tilt_pitch_deg
count,18985.0,18985.0,18985.0
mean,173.902204,-7.779139,1.778432
std,105.863417,103.346355,1.298954
min,0.0,-179.97,0.02
25%,89.81,-97.54,0.96
50%,180.06,-12.889999,1.52
75%,269.75,79.06,2.22
max,360.0,179.95999,11.88
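
The `count` row (18,985) is lower than the 19,022 rows reported by `df_meta.shape`. Because `describe()` counts only non-null values, about 37 rows appear to lack these measurements. A minimal sketch of how one might confirm that, shown on a small made-up frame (`toy` and its values are illustrative only, not taken from the dataset):

```python
import numpy as np
import pandas as pd

deg_cols = ['pano_yaw_deg', 'tilt_yaw_deg', 'tilt_pitch_deg']

# Made-up stand-in for df_meta; the real frame would be used instead.
toy = pd.DataFrame({
    'pano_yaw_deg': [91.98, np.nan, 359.18],
    'tilt_yaw_deg': [125.04, np.nan, 30.77],
    'tilt_pitch_deg': [0.83, np.nan, 2.55],
})

# Number of missing values in each degree column
missing_per_col = toy[deg_cols].isna().sum()

# Rows where at least one of the three measurements is missing
rows_missing = toy[toy[deg_cols].isna().any(axis=1)]

print(missing_per_col)
print(len(rows_missing))
```

Running the same two expressions against `df_meta` would show whether the missing values fall in the same rows for all three columns.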


## Observations
* We would expect depth to vary with position in the image.

Recall: the depth `.txt` files, stored under `depth_txt/` in the `gsv-depths` S3 bucket (https://s3.console.aws.amazon.com/s3/buckets/gsv-depths/depth_txt/?region=us-east-2&tab=overview), contain the relevant information.

# Looking at Txt Data

Image 1908 is of interest

In [12]:
df_1908 = df_meta.loc[df_meta['name'] == 1908]

df_1908

,Unnamed: 0,date,lat,long,pano_id,name,pano_yaw_deg,tilt_yaw_deg,tilt_pitch_deg
9724,9983,2018-09,43.002918,-87.904091,7yGvCWd2veNF4vkmASMt9A,1908,95.689995,118.5,0.82


In [6]:
# Connect to S3 and list the depth text files
# Docs on s3fs: https://s3fs.readthedocs.io/en/latest/
fs = s3fs.S3FileSystem()

s3_depth_bucket = 's3://gsv-depths'
depth_txt_dir = os.path.join(s3_depth_bucket, 'depth_txt')

# See what is in the folder
list_txt_dir = fs.ls(depth_txt_dir)
print(len(list_txt_dir))

19127
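
The listing has 19,127 entries while the metadata has 19,022 rows, so the two cannot match one-to-one. A hedged sketch for finding the differences, assuming every depth file is listed as `gsv-depths/depth_txt/<pano_id>.txt`; the helper names and the toy inputs below are hypothetical:

```python
import os

def pano_id_from_path(path):
    """Strip the directory and the extension from a listed path."""
    return os.path.splitext(os.path.basename(path))[0]

def compare_ids(listed_paths, meta_ids):
    """Return (IDs only in the listing, IDs only in the metadata)."""
    # Skip anything that is not a .txt file, e.g. a folder placeholder key
    listed_ids = {pano_id_from_path(p) for p in listed_paths if p.endswith('.txt')}
    meta_ids = set(meta_ids)
    return listed_ids - meta_ids, meta_ids - listed_ids

# Toy inputs standing in for list_txt_dir and df_meta['pano_id']
toy_listing = [
    'gsv-depths/depth_txt/',
    'gsv-depths/depth_txt/aaa.txt',
    'gsv-depths/depth_txt/bbb.txt',
]
toy_meta_ids = ['aaa', 'ccc']

only_listed, only_meta = compare_ids(toy_listing, toy_meta_ids)
print(only_listed, only_meta)
```

With the real data, the call would be `compare_ids(list_txt_dir, df_meta['pano_id'])`.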


In [25]:
pano_id_1908 = df_1908['pano_id'].values

pano_id_1908

array(['7yGvCWd2veNF4vkmASMt9A'], dtype=object)

In [29]:
# Take the ID from the metadata row rather than hardcoding it
pano_id_1908_str = df_1908['pano_id'].iloc[0]  # '7yGvCWd2veNF4vkmASMt9A'
txt_path_1908 = [filename for filename in list_txt_dir if pano_id_1908_str in filename][0]
txt_path_1908

'gsv-depths/depth_txt/7yGvCWd2veNF4vkmASMt9A.txt'

In [30]:
# Read the whole depth file as raw bytes (s3fs opens in binary mode by default)
with fs.open(txt_path_1908) as file:
    depth_txt = file.read()
len(depth_txt)  # size in bytes

2889660

In [31]:
depth_txt[0:300]

b'NaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\tNaN\t'
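
The raw bytes are tab-separated values with `NaN` where no depth is available, so they can be turned into a numeric array for inspection. The sketch below assumes one image row per line and the same number of values on every line; neither has been verified against the real files. `parse_depth_txt` is a hypothetical helper, and `sample` is made-up data.

```python
import numpy as np

def parse_depth_txt(raw):
    """Parse tab-separated depth bytes into a 2D float array.

    Assumes one row per line; empty tokens left by trailing tabs are dropped.
    """
    rows = []
    for line in raw.decode('ascii').splitlines():
        tokens = [tok for tok in line.split('\t') if tok.strip()]
        if tokens:
            # float() accepts 'NaN' and returns a float nan
            rows.append([float(tok) for tok in tokens])
    return np.array(rows, dtype=float)

# Made-up sample in the same layout as the output above
sample = b'NaN\tNaN\t1.5\t\nNaN\t2.0\t3.25\t\n'
depth = parse_depth_txt(sample)

print(depth.shape)
print(np.isnan(depth).mean())  # fraction of pixels without a depth value
```

On the real file, `parse_depth_txt(depth_txt).shape` would show the depth map's resolution, which is needed before mapping a heading to a column range for cropping.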