## Exercise 3. Update Column Level Metadata on a Dataset
Let's use [Socrata-py](https://github.com/socrata/socrata-py) to add descriptions to the columns of our dataset.

## Import Libraries

In [1]:
import json
import os
import pandas as pd
import requests

from socrata.authorization import Authorization
from socrata import Socrata

## Setup Authentication
- Enter your Socrata username and password, or an API key ID and key secret in their place
- Enter the domain of the dataset; you need publisher or admin access on that domain
- Enter the dataset's unique ID

Let's make the column descriptions for https://alicia.data.socrata.com/dataset/Arizona-Places-Median-Household-Income/9abs-ubh5 more meaningful.

In [130]:
# replace os.environ['MY_SOCRATA_USERNAME'] and os.environ['MY_SOCRATA_PASSWORD'] with your credentials
domain = 'alicia.data.socrata.com'
user_name = os.environ['MY_SOCRATA_USERNAME']
password = os.environ['MY_SOCRATA_PASSWORD']

auth = Authorization(
  domain,
  user_name,
  password
)

socrata = Socrata(auth)
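
Reading credentials with `os.environ[...]` raises a bare `KeyError` when a variable is unset. A minimal sketch of a more explicit check, assuming the same two variable names used in the cell above; the helper name `read_credentials` is hypothetical and not part of socrata-py:

```python
import os


def read_credentials(env=None,
                     user_key='MY_SOCRATA_USERNAME',
                     password_key='MY_SOCRATA_PASSWORD'):
    """Return (user_name, password) from an environment-style mapping.

    Hypothetical helper: it only reads the mapping and never contacts Socrata.
    """
    env = os.environ if env is None else env
    # Collect every missing or empty variable so the error names them all.
    missing = [key for key in (user_key, password_key) if not env.get(key)]
    if missing:
        raise RuntimeError('Set these environment variables first: ' + ', '.join(missing))
    return env[user_key], env[password_key]
```

The returned pair can then be passed to `Authorization(domain, user_name, password)` as in the cell above.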

## Update columns for a dataset by iterating through a file
- `../data/columns_metadata.xlsx` contains the dataset ID, the column field name, and the display name and description for each column, a layout suited to bulk updates
- Replace NaNs with empty strings

Let's update the columns in https://alicia.data.socrata.com/dataset/Arizona-Places-Median-Household-Income/9abs-ubh5 with more meaningful descriptions.

In [2]:
metadata = pd.read_excel('../data/columns_metadata.xlsx')
metadata = metadata.fillna('')
metadata.head()

| Unnamed: 0 | dataset_id | field_name | name | description |
|---|---|---|---|---|
| 0 | 9abs-ubh5 | name | Name | Geography name |
| 1 | 9abs-ubh5 | type | Type | Census geography type |
| 2 | 9abs-ubh5 | variable_description | Description | Census variable description |
| 3 | 9abs-ubh5 | variable | Variable | Census variable id |
| 4 | 9abs-ubh5 | value | Value | Estimate |
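
Each row of the file describes one column, so it can help to group the rows by dataset before updating anything; each dataset then needs only one revision. A minimal sketch using plain dictionaries. The function name `group_descriptions` is hypothetical, and the rows are assumed to carry the `dataset_id`, `field_name`, and `description` keys shown above (for the DataFrame, `metadata.to_dict('records')` produces rows of this shape):

```python
def group_descriptions(rows):
    """Map each dataset_id to {field_name: description} for its columns."""
    grouped = {}
    for row in rows:
        # Skip rows with a blank description so existing text is not erased.
        if row['description'] == '':
            continue
        grouped.setdefault(row['dataset_id'], {})[row['field_name']] = row['description']
    return grouped


rows = [
    {'dataset_id': '9abs-ubh5', 'field_name': 'name', 'description': 'Geography name'},
    {'dataset_id': '9abs-ubh5', 'field_name': 'type', 'description': 'Census geography type'},
    # Hypothetical row with a blank description, included to show the skip.
    {'dataset_id': '9abs-ubh5', 'field_name': 'example_blank', 'description': ''},
]
print(group_descriptions(rows))
```

Skipping blank descriptions is a design choice for this sketch; drop the check if blanks should overwrite existing text.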


## Bill's example
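# Look up the dataset by its unique ID
(ok, view) = socrata.views.lookup('cbet-h6p9')
assert ok, view

# Open a revision and use the existing dataset as its source
(ok, rev) = view.revisions.create_replace_revision()
assert ok, rev
(ok, source) = rev.source_from_dataset()
assert ok, source

# Change the description of one column on the latest output schema
output_schema = source.get_latest_input_schema().get_latest_output_schema()
(ok, new_output_schema) = output_schema\
    .change_column_metadata('employee_number', 'description').to('meh').run()
assert ok, new_output_schema

# Apply the revision with the modified output schema
rev.apply(output_schema = new_output_schema)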
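
The example above changes a single column. A sketch of the same steps wrapped in a function that sets several descriptions in one revision follows. It uses only the socrata-py calls that appear in the example and assumes they return `(ok, result)` tuples as shown there, which may differ between socrata-py versions. The function name `apply_column_descriptions` is hypothetical, and chaining several `change_column_metadata(...).to(...)` calls before one `run()` is an assumption about the library's builder style:

```python
def apply_column_descriptions(socrata, dataset_id, descriptions):
    """Set several column descriptions on one dataset in a single revision.

    `socrata` is an authenticated socrata-py client and `descriptions` maps
    column field names to description text. Hypothetical helper.
    """
    if not descriptions:
        raise ValueError('descriptions must contain at least one column')

    ok, view = socrata.views.lookup(dataset_id)
    assert ok, view
    ok, rev = view.revisions.create_replace_revision()
    assert ok, rev
    ok, source = rev.source_from_dataset()
    assert ok, source
    output_schema = source.get_latest_input_schema().get_latest_output_schema()

    # Queue one metadata change per column, then run them together
    # (assumes the change builder can be chained before run()).
    change = output_schema
    for field_name, text in descriptions.items():
        change = change.change_column_metadata(field_name, 'description').to(text)
    ok, new_output_schema = change.run()
    assert ok, new_output_schema

    return rev.apply(output_schema=new_output_schema)
```

A call might look like `apply_column_descriptions(socrata, '9abs-ubh5', {'name': 'Geography name'})`, using the client created during authentication.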

In [134]:
for idx, row in metadata.iterrows():
    dataset_id = row['dataset_id']
    meta_url = 'https://' + domain + '/api/views/metadata/v1/' + dataset_id

    payload = dict()
    payload['name'] = row['name']
    payload['description'] = row['description']
    # category, tags and attributionLink are dataset-level properties that
    # columns_metadata.xlsx does not contain, so read them with a default
    # instead of raising a KeyError
    payload['category'] = row.get('category', '')

    # if there are multiple keywords, split them by comma
    tags = row.get('tags', '')
    if tags != '':
        payload['tags'] = tags.split(',')
    else:
        payload['tags'] = []

    # empty links return validation error
    attribution_link = row.get('attributionLink', '')
    if attribution_link != '':
        payload['attributionLink'] = attribution_link

    # encode json
    json_data = json.dumps(payload)
    # print(json_data)

    # the request stays commented out: this payload is dataset-level metadata
    # req_update = requests.patch(meta_url, json_data, auth=(user_name,password))
    # meta_new = req_update.text
    # print(meta_new)
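
The payload rules in the loop above (split keywords on commas, leave out an empty attribution link) can be kept in one small function so they are easy to check without calling the API. A minimal sketch, assuming each row behaves like a dictionary. The name `build_payload` is hypothetical, and omitting an empty `category` and trimming whitespace around tags are choices made for this sketch:

```python
import json


def build_payload(row):
    """Build a metadata payload from one row, leaving out empty optional fields."""
    payload = {'name': row['name'], 'description': row['description']}
    if row.get('category'):
        payload['category'] = row['category']
    # Split comma-separated keywords and trim stray whitespace around each one.
    payload['tags'] = [tag.strip() for tag in row.get('tags', '').split(',') if tag.strip()]
    # An empty link fails validation, so include it only when present.
    if row.get('attributionLink'):
        payload['attributionLink'] = row['attributionLink']
    return payload


# Sample values for illustration only.
example = {'name': 'Arizona Places Median Household Income',
           'description': 'Sample description',
           'tags': 'income, census',
           'attributionLink': ''}
print(json.dumps(build_payload(example)))
```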

{
  "action" : "modify",
  "metadata" : {
    "id" : "9abs-ubh5",
    "name" : "Arizona Places Median Household Income",
    "attribution" : "Census",
    "attributionLink" : "https://www.census.gov/programs-surveys/acs/",
    "category" : "Demographics",
    "createdAt" : "2019-02-20T00:22:21+0000",
    "dataUpdatedAt" : "2019-02-20T00:22:38+0000",
    "dataUri" : "https://alicia.data.socrata.com/resource/9abs-ubh5",
    "description" : "Median Household Income - American Community Survey 5 Year Estimates for all Arizona Places from 2011-2017",
    "domain" : "alicia.data.socrata.com",
    "externalId" : null,
    "hideFromCatalog" : false,
    "hideFromDataJson" : false,
    "license" : null,
    "metadataUpdatedAt" : "2019-02-20T02:59:15+0000",
    "provenance" : "OFFICIAL",
    "updatedAt" : "2019-02-20T02:59:15+0000",
    "webUri" : "https://alicia.data.socrata.com/d/9abs-ubh5",
    "customFields" : null,
    "tags" : null
  }
}

{
  "action" : "modify",
  "metadata" : {
    "id" 