<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Pre-requisites" data-toc-modified-id="Pre-requisites-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Pre-requisites</a></span></li><li><span><a href="#Instructions" data-toc-modified-id="Instructions-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Instructions</a></span></li><li><span><a href="#Imports-and-Constants" data-toc-modified-id="Imports-and-Constants-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Imports and Constants</a></span></li><li><span><a href="#Validate-and-Split-Exported-TFRecords" data-toc-modified-id="Validate-and-Split-Exported-TFRecords-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Validate and Split Exported TFRecords</a></span></li><li><span><a href="#Calculate-Mean-and-Std-Dev-for-Each-Band" data-toc-modified-id="Calculate-Mean-and-Std-Dev-for-Each-Band-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Calculate Mean and Std-Dev for Each Band</a></span></li></ul></div>

## Pre-requisites

Go through the [`preprocessing/0_export_tfrecords.ipynb`](./0_export_tfrecords.ipynb) notebook.

Before running this notebook, you should have the following structure under the `data/` directory:

```
data/
    dhs_tfrecords_raw/
        angola_2011_00.tfrecord.gz
        ...
        zimbabwe_2015_XX.tfrecord.gz
    dhsnl_tfrecords_raw/
        angola_2010_00.tfrecord.gz
        ...
        zimbabwe_2016_XX.tfrecord.gz
    lsms_tfrecords_raw/
        ethiopia_2011_00.tfrecord.gz
        ...
        uganda_2013_XX.tfrecord.gz
```
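As a quick sanity check on the layout above, the raw shards can be grouped by survey before any processing starts. The following is a minimal sketch, assuming file names follow the `<country>_<year>_<shard>.tfrecord.gz` pattern shown in the tree. The helper name `group_raw_shards` and the regular expression are illustrative and not part of this repo.

```python
from collections import defaultdict
import os
import re

# Assumed file name pattern: <country>_<year>_<shard>.tfrecord.gz
SHARD_RE = re.compile(r'^(?P<survey>.+_\d{4})_(?P<shard>\d+)\.tfrecord\.gz$')


def group_raw_shards(paths):
    '''Groups raw TFRecord paths by survey, with shards in numeric order.

    Hypothetical helper: returns {survey: [path, ...]} sorted by survey name.
    '''
    groups = defaultdict(list)
    for path in paths:
        match = SHARD_RE.match(os.path.basename(path))
        if match is None:
            continue  # skip files that do not look like exported shards
        groups[match['survey']].append((int(match['shard']), path))
    return {survey: [p for _, p in sorted(shards)]
            for survey, shards in sorted(groups.items())}
```

Calling it on `glob('data/dhs_tfrecords_raw/*')` would list, per survey, the shards that are about to be split, which makes a missing or misnamed export easy to spot.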

## Instructions

This notebook processes the exported TFRecords as follows:
1. Verifies that the fields in the TFRecords match the original CSV files.
2. Splits each monolithic TFRecord file exported from Google Earth Engine into one file per record.

After running this notebook, you should have three new folders (`dhs_tfrecords`, `dhsnl_tfrecords`, and `lsms_tfrecords`) under `data/`:

```
data/
    dhs_tfrecords/
        angola_2011/
            00000.tfrecord.gz
            ...
            00229.tfrecord.gz
        ...
        zimbabwe_2015/
            00000.tfrecord.gz
            ...
            00399.tfrecord.gz
    dhsnl_tfrecords/
        angola_2010/
            00000.tfrecord.gz
            ...
            07734.tfrecord.gz
        zimbabwe_2016/
            00000.tfrecord.gz
            ...
            03584.tfrecord.gz
    lsms_tfrecords/
        ethiopia_2011/
            00000.tfrecord.gz
            ...
            00326.tfrecord.gz
        uganda_2013/
            00000.tfrecord.gz
            ...
            00164.tfrecord.gz
```
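One way to confirm that the split produced the expected layout is to count the per-record files in each survey folder and compare the counts against the number of rows for that survey in the CSV (for example, `angola_2011` above ends at `00229.tfrecord.gz`, i.e. 230 files). This is a rough sketch only; `count_processed_records` is a hypothetical helper, not something defined in this repo.

```python
import os


def count_processed_records(processed_dir):
    '''Returns {survey_name: number of *.tfrecord.gz files} for one dataset folder.'''
    counts = {}
    for survey in sorted(os.listdir(processed_dir)):
        survey_dir = os.path.join(processed_dir, survey)
        if not os.path.isdir(survey_dir):
            continue  # ignore stray files next to the survey folders
        counts[survey] = sum(
            name.endswith('.tfrecord.gz') for name in os.listdir(survey_dir))
    return counts
```

For instance, `count_processed_records('data/dhs_tfrecords')` could be compared against `df.groupby(['country', 'year']).size()` from the clusters CSV.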

This notebook also calculates the per-band mean and standard deviation for each of the 3 datasets.
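To illustrate the idea, per-band statistics can be accumulated in a single streaming pass using running sums. The sketch below is not the `per_band_mean_std` helper imported later in this notebook; it is a plain-Python illustration that assumes each image is a dict mapping a band name to a flat list of pixel values, and the band names in the usage note are made up.

```python
import math


def running_band_stats(images):
    '''Streams over images and returns {band: (mean, std)}.

    Each image is assumed to be a dict mapping a band name to a flat list of
    pixel values. Only running sums are kept, so one image is in memory at a time.
    '''
    count, total, total_sq = {}, {}, {}
    for image in images:
        for band, pixels in image.items():
            count[band] = count.get(band, 0) + len(pixels)
            total[band] = total.get(band, 0.0) + sum(pixels)
            total_sq[band] = total_sq.get(band, 0.0) + sum(p * p for p in pixels)
    stats = {}
    for band, n in count.items():
        if n == 0:
            continue  # no pixels seen for this band
        mean = total[band] / n
        # population variance, clamped at 0 to absorb rounding error
        var = max(total_sq[band] / n - mean * mean, 0.0)
        stats[band] = (mean, math.sqrt(var))
    return stats
```

With two toy images such as `{'RED': [1.0, 2.0]}` and `{'RED': [3.0, 4.0]}`, the result for `'RED'` is a mean of 2.5 and a population standard deviation of about 1.118. The sum-of-squares form is simple but can lose precision on very large inputs; Welford's algorithm is a common alternative.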

## Imports and Constants

In [1]:
%load_ext autoreload
%autoreload 2


# change directory to repo root, and verify
%cd '../'
!pwd

/atlas/u/erikrozi/rural-urban-Modelbias
/atlas/u/erikrozi/rural-urban-Modelbias


In [51]:
from __future__ import annotations

from collections.abc import Iterable
from glob import glob
from pprint import pprint
import os
import re
from typing import Optional

import numpy as np
import pandas as pd
import tensorflow as tf
from tqdm import tqdm

from batchers import batcher, tfrecord_paths_utils
from preprocessing.helper import (
    analyze_tfrecord_batch,
    per_band_mean_std,
    print_analysis_results)

In [5]:
REQUIRED_BANDS = ['long', 'lat']
BUILDINGS_EXPORT_FOLDER = '/content/drive/MyDrive/dhs_geo'
BUILDINGS_PROCESSED_FOLDER = '/content/drive/MyDrive/dhs_geo_processed'
# BUILDINGS_EXPORT_FOLDER = '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw'
# BUILDINGS_PROCESSED_FOLDER = '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings'

## Validate and Split Exported TFRecords

In [54]:
def process_dataset(csv_path: str, input_dir: str, processed_dir: str) -> None:
    '''
    Args
    - csv_path: str, path to CSV of DHS or LSMS clusters
    - input_dir: str, path to TFRecords exported from Google Earth Engine
    - processed_dir: str, folder where to save processed TFRecords
    '''
    df = pd.read_csv(csv_path, float_precision='high', index_col=False)
    surveys = list(df.groupby(['country', 'year']).groups.keys())  # (country, year) tuples
    for country, year in surveys:
        country_year = f'{country}_{year}'
        print('Processing:', country_year)
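        # glob() returns paths in arbitrary order, and list.sort() returns None,
        # so use sorted() to get the shards in numeric order
        tfrecord_paths = sorted(
            glob(os.path.join(input_dir, country_year + '*')),
            key=lambda f: int(re.sub(r'\D', '', os.path.basename(f))))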
        out_dir = os.path.join(processed_dir, country_year)
        os.makedirs(out_dir, exist_ok=True)
        subset_df = df[(df['country'] == country) & (df['year'] == year)].reset_index(drop=True)
        validate_and_split_tfrecords(
            tfrecord_paths=tfrecord_paths, out_dir=out_dir, df=subset_df)


def validate_and_split_tfrecords(
        tfrecord_paths: Iterable[str],
        out_dir: str,
        df: pd.DataFrame
        ) -> None:
    '''Validates and splits a list of exported TFRecord files (for a
    given country-year survey) into individual TFRecords, one per cluster.

    "Validating" a TFRecord consists of 2 parts:
    1) verifying that it contains the required bands
    2) verifying that its other features match the values from the dataset CSV

    Args
    - tfrecord_paths: list of str, paths to exported TFRecord files
    - out_dir: str, path to dir to save processed individual TFRecords
    - df: pd.DataFrame, index is sequential and starts at 0
    '''
    print(tfrecord_paths)
    # Create an iterator over the TFRecords file. The iterator yields
    # the binary representations of Example messages as strings.
    options = tf.io.TFRecordOptions(tf.compat.v1.io.TFRecordCompressionType.GZIP)
    # cast float64 => float32 and str => bytes
    for col in df.columns:
        if df[col].dtype == np.float64:
            df[col] = df[col].astype(np.float32)
        elif df[col].dtype == object:  # pandas uses 'object' type for str
            df[col] = df[col].astype(bytes)
    i = 0
    progbar = tqdm(total=len(df))
    print(df[df['lat']==-10.556163787841797])
    for tfrecord_path in tfrecord_paths:
        iterator = tf.compat.v1.io.tf_record_iterator(tfrecord_path, options=options)
        for record_str in iterator:
            # parse into an actual Example message
            ex = tf.train.Example.FromString(record_str)
            feature_map = ex.features.feature
            # verify required bands exist
            for band in REQUIRED_BANDS:
                assert band in feature_map, f'Band "{band}" not in record {i} of {tfrecord_path}'
            # compare feature map values against CSV values
            csv_feats = df.loc[i, :].to_dict()
            for col, val in csv_feats.items():
                ft_type = feature_map[col].WhichOneof('kind')
                ex_val = getattr(feature_map[col], ft_type).value[0]
                assert val == ex_val, f'Expected {col}={val}, but found {ex_val} instead'
            # serialize to string and write to file
            out_path = os.path.join(out_dir, f'{i:05d}.tfrecord.gz')  # zero-padded to 5 digits; assumes < 1e5 clusters per survey
            with tf.io.TFRecordWriter(out_path, options=options) as writer:
                writer.write(ex.SerializeToString())
            i += 1
            progbar.update(1)
    progbar.close()
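
The float64 => float32 cast in `validate_and_split_tfrecords` matters because the check uses exact equality: float features in a `tf.train.Example` are stored as 32-bit floats, while pandas reads CSV floats as float64. The snippet below is a small standard-library illustration of that effect; `to_float32` and `values_match` are hypothetical names used only here.

```python
import struct


def to_float32(value):
    '''Rounds a Python float (float64) to the nearest float32 value.'''
    return struct.unpack('f', struct.pack('f', value))[0]


def values_match(csv_val, record_val):
    '''Compares a CSV value to a record value, casting floats to float32 first.'''
    if isinstance(csv_val, float):
        return to_float32(csv_val) == record_val
    return csv_val == record_val


csv_val = 0.1                      # float64, as read from a CSV
record_val = to_float32(0.1)       # what a float32 feature would hold
print(csv_val == record_val)       # False: the raw float64 differs
print(values_match(csv_val, record_val))  # True after casting
```

Without the cast, most non-trivial float columns would fail the assertion even when the exported record is correct.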

In [53]:
process_dataset(
    csv_path='/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_clusters.csv',
    input_dir=BUILDINGS_EXPORT_FOLDER,
    processed_dir=BUILDINGS_PROCESSED_FOLDER)

Processing: angola_2011
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_00.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_01.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_02.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_03.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_04.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_05.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_06.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_07.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2011_08.tfrecord.gz'

  0%|          | 1/230 [00:00<00:43,  5.25it/s]

      country  year        lat        lon      GID_1        GID_2  \
70  b'angola'  2011 -10.556164  22.039558  b'AGO.13'  b'AGO.13.3'   

    wealthpooled  households  urban_rural  
70     -1.128011          37            0  


100%|██████████| 230/230 [00:06<00:00, 34.35it/s]


Processing: angola_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/angola_2015_00.tfrecord.gz']


  1%|          | 7/625 [00:00<00:09, 61.89it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 625/625 [00:07<00:00, 83.31it/s] 


Processing: benin_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/benin_2012_00.tfrecord.gz']


  3%|▎         | 19/746 [00:00<00:04, 178.87it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 746/746 [00:07<00:00, 96.04it/s] 


Processing: burkina_faso_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/burkina_faso_2010_00.tfrecord.gz']


  1%|▏         | 8/541 [00:00<00:07, 75.69it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 541/541 [00:04<00:00, 111.46it/s]


Processing: burkina_faso_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/burkina_faso_2014_00.tfrecord.gz']


  7%|▋         | 17/248 [00:00<00:01, 167.87it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 248/248 [00:02<00:00, 111.94it/s]


Processing: cameroon_2011
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/cameroon_2011_00.tfrecord.gz']


  4%|▍         | 23/576 [00:00<00:02, 222.33it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 576/576 [00:02<00:00, 227.62it/s]


Processing: cote_d_ivoire_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/cote_d_ivoire_2012_00.tfrecord.gz']


  3%|▎         | 9/341 [00:00<00:05, 65.51it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 341/341 [00:03<00:00, 91.00it/s] 


Processing: democratic_republic_of_congo_2013
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/democratic_republic_of_congo_2013_00.tfrecord.gz']


  1%|▏         | 7/492 [00:00<00:07, 63.88it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 492/492 [00:04<00:00, 111.47it/s]


Processing: ethiopia_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/ethiopia_2010_00.tfrecord.gz']


  2%|▏         | 9/571 [00:00<00:06, 81.77it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 571/571 [00:05<00:00, 108.35it/s]


Processing: ethiopia_2016
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/ethiopia_2016_00.tfrecord.gz']


  2%|▏         | 10/622 [00:00<00:06, 96.36it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 622/622 [00:06<00:00, 94.69it/s] 


Processing: ghana_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/ghana_2014_00.tfrecord.gz']


  1%|          | 4/422 [00:00<00:10, 39.30it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 422/422 [00:05<00:00, 81.63it/s] 


Processing: ghana_2016
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/ghana_2016_00.tfrecord.gz']


  5%|▍         | 9/192 [00:00<00:02, 87.66it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 192/192 [00:02<00:00, 88.40it/s] 


Processing: guinea_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/guinea_2012_00.tfrecord.gz']


  3%|▎         | 8/300 [00:00<00:03, 76.95it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 300/300 [00:02<00:00, 104.71it/s]


Processing: kenya_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/kenya_2014_00.tfrecord.gz']


  0%|          | 3/1585 [00:00<01:11, 22.11it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 1585/1585 [00:15<00:00, 103.64it/s]


Processing: kenya_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/kenya_2015_00.tfrecord.gz']


  2%|▏         | 5/245 [00:00<00:04, 49.54it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 245/245 [00:02<00:00, 97.34it/s] 


Processing: lesotho_2009
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/lesotho_2009_00.tfrecord.gz']


  3%|▎         | 13/395 [00:00<00:03, 124.97it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 395/395 [00:02<00:00, 138.73it/s]


Processing: lesotho_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/lesotho_2014_00.tfrecord.gz']


  2%|▏         | 9/399 [00:00<00:04, 89.59it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 399/399 [00:03<00:00, 128.17it/s]


Processing: malawi_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/malawi_2010_00.tfrecord.gz']


  2%|▏         | 15/827 [00:00<00:05, 141.29it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 827/827 [00:07<00:00, 117.81it/s]


Processing: malawi_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/malawi_2012_00.tfrecord.gz']


 10%|█         | 14/140 [00:00<00:00, 136.38it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 140/140 [00:01<00:00, 98.50it/s] 


Processing: malawi_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/malawi_2014_00.tfrecord.gz']


 11%|█▏        | 16/140 [00:00<00:00, 152.92it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 140/140 [00:01<00:00, 105.58it/s]


Processing: malawi_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/malawi_2015_00.tfrecord.gz']


  2%|▏         | 18/850 [00:00<00:04, 172.53it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 850/850 [00:05<00:00, 142.40it/s]


Processing: mali_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mali_2012_00.tfrecord.gz']


  6%|▌         | 23/413 [00:00<00:01, 227.69it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 413/413 [00:01<00:00, 229.89it/s]


Processing: mali_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mali_2015_00.tfrecord.gz']


 13%|█▎        | 23/177 [00:00<00:00, 225.58it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 177/177 [00:00<00:00, 225.91it/s]


Processing: mozambique_2009
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2009_00.tfrecord.gz']


  3%|▎         | 9/270 [00:00<00:03, 83.99it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 270/270 [00:05<00:00, 45.82it/s] 


Processing: mozambique_2011
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_00.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_01.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_02.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_03.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_04.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_05.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_06.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/mozambique_2011_07.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_build

  0%|          | 1/609 [00:00<01:06,  9.12it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 609/609 [00:15<00:00, 38.08it/s]


Processing: nigeria_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2010_00.tfrecord.gz']


  0%|          | 0/239 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 239/239 [00:04<00:00, 52.30it/s] 


Processing: nigeria_2013
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_00.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_01.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_02.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_03.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_04.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_05.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_06.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_07.tfrecord.gz', '/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2013_08.tf

  1%|          | 6/889 [00:00<00:23, 37.50it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 889/889 [00:50<00:00, 17.52it/s]


Processing: nigeria_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/nigeria_2015_00.tfrecord.gz']


  0%|          | 0/322 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 322/322 [00:08<00:00, 38.96it/s] 


Processing: rwanda_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/rwanda_2010_00.tfrecord.gz']


  0%|          | 0/492 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 492/492 [00:04<00:00, 104.36it/s]


Processing: rwanda_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/rwanda_2014_00.tfrecord.gz']


  0%|          | 0/492 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 492/492 [00:04<00:00, 99.68it/s] 


Processing: senegal_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/senegal_2010_00.tfrecord.gz']


  0%|          | 0/385 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 385/385 [00:02<00:00, 128.63it/s]


Processing: senegal_2012
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/senegal_2012_00.tfrecord.gz']


  0%|          | 0/200 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 200/200 [00:02<00:00, 99.32it/s] 


Processing: sierra_leone_2013
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/sierra_leone_2013_00.tfrecord.gz']


  0%|          | 0/435 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 435/435 [00:03<00:00, 139.39it/s]


Processing: tanzania_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/tanzania_2010_00.tfrecord.gz']


  0%|          | 0/458 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 458/458 [00:04<00:00, 113.88it/s]


Processing: tanzania_2011
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/tanzania_2011_00.tfrecord.gz']


  0%|          | 0/573 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 573/573 [00:04<00:00, 121.29it/s]


Processing: tanzania_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/tanzania_2015_00.tfrecord.gz']


  0%|          | 0/608 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 608/608 [00:05<00:00, 114.66it/s]


Processing: togo_2013
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/togo_2013_00.tfrecord.gz']


  0%|          | 0/330 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 330/330 [00:03<00:00, 99.79it/s] 


Processing: uganda_2009
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/uganda_2009_00.tfrecord.gz']


  0%|          | 0/170 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 170/170 [00:01<00:00, 92.86it/s] 


Processing: uganda_2011
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/uganda_2011_00.tfrecord.gz']


  0%|          | 0/400 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 400/400 [00:03<00:00, 101.10it/s]


Processing: uganda_2014
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/uganda_2014_00.tfrecord.gz']


  0%|          | 0/208 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 208/208 [00:01<00:00, 105.90it/s]


Processing: zambia_2013
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/zambia_2013_00.tfrecord.gz']


  0%|          | 0/719 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 719/719 [00:05<00:00, 121.81it/s]


Processing: zimbabwe_2010
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/zimbabwe_2010_00.tfrecord.gz']


  0%|          | 0/393 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 393/393 [00:03<00:00, 117.85it/s]


Processing: zimbabwe_2015
None
['/atlas/u/erikrozi/bias_mitigation/africa_poverty_clean/data/dhs_buildings_raw/zimbabwe_2015_00.tfrecord.gz']


  0%|          | 0/400 [00:00<?, ?it/s]

Empty DataFrame
Columns: [country, year, lat, lon, GID_1, GID_2, wealthpooled, households, urban_rural]
Index: []


100%|██████████| 400/400 [00:03<00:00, 123.13it/s]
