## Exercise 4. Add a Transform
Let's use [Socrata-py](https://github.com/socrata/socrata-py#transforming-your-data) to apply transforms to columns of the dataset at https://alicia.data.socrata.com/dataset/Arizona-Places-Median-Household-Income/9abs-ubh5.

- Title-case the geography type
- Replace estimate errors in the value column

## Import Libraries

In [8]:
import json
import os
import pandas as pd
import requests

from socrata.authorization import Authorization
from socrata import Socrata

## Setup Authentication
- Enter either a Socrata user name and password, or an [API key](https://socrataapikeys.docs.apiary.io) ID and secret in their place
- Enter the domain of the dataset (you need publisher or admin access on it)
- Enter the dataset's unique ID

In [9]:
# replace the environment variables with your credentials on lab machines
domain = 'alicia.data.socrata.com'
user_name = os.environ['SOCRATA_KEY_ID']
password = os.environ['SOCRATA_KEY_SECRET']
dataset_id = '9abs-ubh5'

auth = Authorization(
  domain,
  user_name,
  password
)

socrata = Socrata(auth)
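
The cell above raises a bare `KeyError` if either environment variable is unset. As a minimal sketch, the lookup could be wrapped in a small helper that names the missing variables. `read_credentials` is a hypothetical helper for illustration, not part of socrata-py.

```python
import os


def read_credentials(env=None):
    """Return (key_id, key_secret), or raise an error naming what is missing."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    names = ("SOCRATA_KEY_ID", "SOCRATA_KEY_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair could then be passed to `Authorization(domain, user_name, password)` in place of the direct `os.environ[...]` lookups.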

## Title Case
- Apply [title_case()](https://dev.socrata.com/docs/transforms/title_case.html) to the Geography Type field (the `type` column)

In [11]:
# look up the dataset and open a replace revision so the `type` column can be title-cased
(ok, view) = socrata.views.lookup(dataset_id)
assert ok, view

(ok, revision) = view.revisions.create_replace_revision()
assert ok, revision

(ok, source) = revision.source_from_dataset()
assert ok, source

output_schema = source.get_latest_input_schema().get_latest_output_schema()

# replace the transform on the `type` column with title_case()
(ok, new_output_schema) = output_schema\
    .change_column_transform('type').to('title_case(`type`)').run()
assert ok, new_output_schema

revision.apply(output_schema = new_output_schema)

(True, Job({'created_at': '2019-03-28T19:56:42.530851Z',
  'created_by': {'display_name': 'Alicia Brown',
                 'email': 'alicia.brown@socrata.com',
                 'user_id': '7krv-k7t3'},
  'finished_at': None,
  'id': 343699,
  'is_edit': True,
  'job_uuid': 'b5b78235-23d8-4d31-ac56-e4b51a05ce44',
  'log': [],
  'output_schema_id': 547575,
  'request_id': 'e72a1fa465998c216daad2d71cc7fd2a',
  'status': 'initializing',
  'updated_at': '2019-03-28T19:56:42.530861Z'}))
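
Before submitting a revision, it can help to preview the effect on a few values locally. The sketch below approximates title casing with Python's `str.title()`. This is an assumption for illustration only: Socrata's `title_case()` may treat apostrophes, hyphens, or acronyms differently. `preview_title_case` is a hypothetical helper and the sample strings are made up.

```python
def preview_title_case(text):
    """Approximate title casing locally; None (a missing value) passes through."""
    if text is None:
        return None
    # str.title() capitalizes the first letter of each word and lowercases the rest.
    return text.title()


# Made-up sample values, not taken from the dataset.
samples = ["census designated place", "CITY", None]
print([preview_title_case(s) for s in samples])
```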

## Replace Estimate Errors
- Apply [case()](https://dev.socrata.com/docs/transforms/case.html) to the Value field to handle values that, per the Census annotation, were not estimated for a given year and geography
- These estimation errors appear as `-666,666,666`. Set them to an empty string and wrap the conversion in [forgive](https://dev.socrata.com/docs/transforms/forgive.html), so they become missing values rather than zeros

In [20]:
# look up the dataset and open a replace revision so estimate errors in the `value` column can be blanked out
(ok, view) = socrata.views.lookup(dataset_id)
assert ok, view

(ok, revision) = view.revisions.create_replace_revision()
assert ok, revision

(ok, source) = revision.source_from_dataset()
assert ok, source

output_schema = source.get_latest_input_schema().get_latest_output_schema()

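# replace negative values (estimate errors) with a missing value; keep all other values as-is
(ok, new_output_schema) = output_schema\
    .change_column_transform("value").to("case(`value` < 0, forgive(to_number('')), true, `value`)").run()
assert ok, new_output_schema

revision.apply(output_schema = new_output_schema)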

(True, Job({'created_at': '2019-03-28T20:01:25.488699Z',
  'created_by': {'display_name': 'Alicia Brown',
                 'email': 'alicia.brown@socrata.com',
                 'user_id': '7krv-k7t3'},
  'finished_at': None,
  'id': 343710,
  'is_edit': True,
  'job_uuid': 'b275aa40-6cb0-4819-a96b-af4dff07837a',
  'log': [],
  'output_schema_id': 547655,
  'request_id': 'dde2b66496ed66cc3ddbf1e32c7d6d3c',
  'status': 'initializing',
  'updated_at': '2019-03-28T20:01:25.488711Z'}))
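
The branching in the `case()` expression can be sketched in plain Python to check the intended behavior on a few values. This is only a local approximation under the assumption that the transform yields a missing value for negatives; `blank_estimate_errors` is a hypothetical helper and does not call Socrata.

```python
def blank_estimate_errors(value):
    """Mirror case(`value` < 0, <missing>, true, `value`) for a single value."""
    # Negative values such as -666666666 mark estimates the Census did not produce.
    if value is not None and value < 0:
        return None
    # Everything else, including already-missing values, is kept unchanged.
    return value


# Made-up sample values, not taken from the dataset.
print([blank_estimate_errors(v) for v in [-666666666, 52000, None]])
```

Note that the condition treats every negative number as an error, which matches the transform above; a median income should never be negative, so no legitimate value is lost.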