modules.extract.ExtractTable
=======================

`ExtractTable` is a class in the `extract` module that provides methods for extracting tables from tabular data sources.

---

__Examples Setup__

The following commands set up the examples below.

*Note:* The example input files were pulled and converted from the GeoJSON [link](http://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson) provided in the [geopandas IO docs](https://geopandas.org/io.html).

In [None]:
# Jupyter Notebook Setup
!pip install -r ../../../../requirements.txt > /dev/null

In [None]:
# Module Installation
!git clone https://github.com/KeiferC/gdutils.git > /dev/null
!pip install -r gdutils/requirements.txt > /dev/null

In [None]:
# Add the current directory to the module search path
import os
import sys
sys.path.insert(0, os.path.abspath('.'))

In [None]:
from gdutils.modules.extract import ExtractTable # import class from module

import geopandas as gpd
import pandas as pd

---

Example 1. Extract a table 
-------------------------------

*Note*: `extract()` returns a `geopandas GeoDataFrame`

__Example 1.1.__ Extract a table from a file


- Example 1.1.1. Extract from a shapefile

In [None]:
# Ex. 1.1.1

shp_path = './example-inputs/example-shp/example.shp' # path to file containing table to extract
shp_et = ExtractTable(shp_path) # alternative: ExtractTable.read_file(shp_path)
shp_gdf = shp_et.extract() # extracts table as a geopandas GeoDataFrame

shp_gdf.head() # renders first 5 rows of table

- Example 1.1.2. Extract from a CSV

In [None]:
# Ex. 1.1.2

csv_path = './example-inputs/example.csv'
csv_et = ExtractTable.read_file(csv_path) # using alternative
csv_gdf = csv_et.extract()

csv_gdf.head()

- Example 1.1.3. Extract from an Excel file

In [None]:
# Ex. 1.1.3

excel_path = './example-inputs/example.xlsx' # assumes the Excel example file uses the .xlsx extension
excel_gdf = ExtractTable.read_file(excel_path).extract() # shorthand equivalent

excel_gdf.head()

- Example 1.1.4. Extract from a ZIP file

In [None]:
# Ex. 1.1.4

zip_path = './example-inputs/example.zip'
zip_gdf = ExtractTable.read_file(zip_path).extract()

zip_gdf.head()

__Example 1.2.__ Extract a table from a URL

In [None]:
# Ex. 1.2

# URL copied from https://geopandas.org/io.html
url = 'http://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson'
url_gdf = ExtractTable(url).extract()

url_gdf.head()

__Example 1.3.__ Extract a table from a `pandas DataFrame`

In [None]:
# Ex. 1.3

pandas_df = pd.read_csv(csv_path)
pandas_gdf = ExtractTable(pandas_df).extract()

pandas_gdf.head()

__Example 1.4.__ Extract a table from a `geopandas GeoDataFrame`

In [None]:
# Ex. 1.4

geopandas_gdf = ExtractTable(csv_gdf).extract()

geopandas_gdf.head()

Example 2. Extract a table with a selected index
-------------------------------------------------------------------------

__Example 2.1.__ Extract a table with a known column label as the index

In [None]:
# Ex. 2.1

known_column = 'featurecla'
# alternative: ExtractTable.read_file(shp_path, column=known_column)
known_column_gdf = ExtractTable(shp_path, column=known_column).extract()

known_column_gdf.head()

__Example 2.2.__ Extract a table when the index column label is not known in advance

In [None]:
# Ex. 2.2

unknown_column_et = ExtractTable(shp_path)
columns_list = unknown_column_et.list_columns() # returns a list of columns from which to choose
print(columns_list)

In [None]:
unknown_column_et.column = 'scalerank' # selects the 'scalerank' column as the index

unknown_column_gdf = unknown_column_et.extract()

unknown_column_gdf.head()
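
What selecting a column as the index amounts to can be sketched with plain pandas. This is only an analogy, assuming the effect resembles `DataFrame.set_index`; it does not use `ExtractTable`, and the table values below are invented for illustration (only the column names come from the examples above).

```python
import pandas as pd

# Toy table: column names mirror the examples, values are invented.
toy_df = pd.DataFrame({
    'scalerank': [0, 1, 1],
    'featurecla': ['Land', 'Land', 'Island'],
})

# Assumed pandas analogue of choosing 'scalerank' as the index column.
indexed_df = toy_df.set_index('scalerank')

print(indexed_df.index.name)     # name of the index column
print(list(indexed_df.columns))  # remaining data columns
```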

Example 3. Extract a subtable
-----------------------------------

__Example 3.1.__ Extract a subtable with a known column value

In [None]:
# Ex. 3.1


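As a rough, library-independent sketch, a subtable for one known column value can be pictured as a row filter in plain pandas. This does not use `ExtractTable`; the values `'Land'` and `'Island'` are invented for illustration, and only the column names come from the examples above.

```python
import pandas as pd

# Toy table: column names mirror the examples, values are invented.
toy_df = pd.DataFrame({
    'featurecla': ['Land', 'Island', 'Land'],
    'scalerank': [0, 1, 1],
})

# Keep only the rows whose 'featurecla' equals the known value.
known_value = 'Land'
subtable = toy_df[toy_df['featurecla'] == known_value]

print(subtable)
```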

__Example 3.2.__ Extract a subtable with multiple known column values

In [None]:
# Ex. 3.2


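With several known values, the same idea becomes a membership test. The sketch below uses `Series.isin` from plain pandas rather than `ExtractTable`; the table values (`'Land'`, `'Island'`, `'Reef'`) are hypothetical.

```python
import pandas as pd

# Toy table: column names mirror the examples, values are invented.
toy_df = pd.DataFrame({
    'featurecla': ['Land', 'Island', 'Reef', 'Land'],
    'scalerank': [0, 1, 1, 0],
})

# Keep rows whose 'featurecla' matches any of the known values.
known_values = ['Land', 'Island']
subtable = toy_df[toy_df['featurecla'].isin(known_values)]

print(subtable)
```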

__Example 3.3.__ Extract a subtable without a known column value

In [None]:
# Ex. 3.3


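When the values in a column are not known in advance, one plausible workflow is to list the distinct values first and then pick one, much as Example 2.2 lists columns before choosing an index. The sketch below shows that pattern in plain pandas; it is not the `ExtractTable` API, and the table values are invented.

```python
import pandas as pd

# Toy table: column names mirror the examples, values are invented.
toy_df = pd.DataFrame({
    'featurecla': ['Land', 'Island', 'Land'],
    'scalerank': [0, 1, 1],
})

# Step 1: list the distinct values, in order of first appearance.
available_values = toy_df['featurecla'].unique().tolist()
print(available_values)

# Step 2: choose one of the listed values and filter on it.
chosen_value = available_values[1]
subtable = toy_df[toy_df['featurecla'] == chosen_value]
print(subtable)
```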

Example 4. Extract to a file
-------------------------------

In [None]:
# Ex. 4.


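Writing an extracted table to disk can be sketched, independently of `ExtractTable`, with pandas' own `to_csv`. The output file name `subtable.csv` and the table values are hypothetical; a temporary directory keeps the sketch from leaving files behind.

```python
import os
import tempfile

import pandas as pd

# Toy table: column names mirror the examples, values are invented.
toy_df = pd.DataFrame({
    'featurecla': ['Land', 'Island'],
    'scalerank': [0, 1],
})
subtable = toy_df[toy_df['featurecla'] == 'Land']

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'subtable.csv')  # hypothetical file name
    subtable.to_csv(out_path, index=False)            # write without the row index
    round_trip = pd.read_csv(out_path)                # read back to confirm contents

print(round_trip)
```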

---

__Examples Cleanup__

The following commands reset and clean up the interactive examples above.

In [None]:
# Remove Installed Module
!rm -rf gdutils

In [None]:
# Reset Jupyter Notebook IPython Kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")