# Data enrichment

[Census data available on Wikipedia](https://en.wikipedia.org/wiki/List_of_states_and_union_territories_of_India_by_population)

Workflow:
1. Delete the disputed boundaries.
2. Read the census file.
3. Add new fields to the attribute table.
4. Update the table.
5. Add metadata.

In [3]:
import arcpy

# Environment settings
arcpy.env.workspace = r"C:\Users\nomitrawat\Documents\Training\Data\Administrative Boundary Database"
arcpy.env.overwriteOutput = True

# Copy the feature class
result = arcpy.management.Copy("STATE_BOUNDARY_COPY.shp", "STATE.shp")

# Delete the disputed geometries
fields = ["STATE"]

with arcpy.da.UpdateCursor(result, fields) as cursor:
    for row in cursor:
        state_name = row[0]

        # Remove every feature whose name starts with "DISPUTED"
        if state_name.startswith("DISPUTED"):
            cursor.deleteRow()



# Census Data

Source: Wikipedia.

Create a dictionary keyed by state name so that adding the data to the attribute table is easy.

In [27]:
census = r"C:\Users\nomitrawat\Documents\Training\Data\India population census.tsv"

# Read all lines; the with block closes the file automatically
with open(census, "r") as f:
    lines = f.readlines()

state_details = {}

# Skip the header
for line in lines[1:]:
    details = line.split("\t")
    # Key: state name; value: [pop 2011, pop 2023, urban pop 2011, rural pop 2011],
    # with surrounding whitespace and thousands separators removed
    state_details[details[0]] = [value.strip().replace(",", "") for value in details[1:5]]


for key, value in state_details.items():
    print(f"Name of the state :{key}")
    print(f"Population in 2011 :{value[0]}")
    print(f"Population in 2023 :{value[1]}")
    print(f"Urban population 2011 :{value[2]}")
    print(f"Rural population 2011 :{value[3]}")
    # print(f"Total population in 2011 matches sum of rural and urban :{int(value[2])+int(value[3])==int(value[0])}")
    print("\n")

print(f"Total number of the items in state_details : {len(state_details)}")

Name of the state :ANDAMAN & NICOBAR
Population in 2011 :380581
Population in 2023 :403000
Urban population 2011 :237093
Rural population 2011 :143488
Is urban + rural =? pop in 11: True


Name of the state :ANDHRA PRADESH
Population in 2011 :49577103
Population in 2023 :53156000
Urban population 2011 :34966693
Rural population 2011 :14610410
Is urban + rural =? pop in 11: True


Name of the state :ARUNACHAL PRADESH
Population in 2011 :1383727
Population in 2023 :1562000
Urban population 2011 :1066358
Rural population 2011 :317369
Is urban + rural =? pop in 11: True


Name of the state :ASSAM
Population in 2011 :31205576
Population in 2023 :35713000
Urban population 2011 :26807034
Rural population 2011 :4398542
Is urban + rural =? pop in 11: True


Name of the state :BIHAR
Population in 2011 :104099452
Population in 2023 :126756000
Urban population 2011 :92341436
Rural population 2011 :11758016
Is urban + rural =? pop in 11: True


Name of the state :CHANDIGARH
Population in 2011 :1055
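
The parsing step above can also be written with the standard `csv` module, which handles the tab splitting and works on any open file object. This is only a minimal sketch, not the notebook's own code: the function name `read_census` is hypothetical, the header row in the sample is assumed, and the figures are the Andaman & Nicobar values from the printed output. Unlike `state_details`, it stores integers rather than strings.

```python
import csv
import io


def read_census(handle):
    """Return {state: [pop11, pop23, urban11, rural11]} from a tab-separated file object."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    details = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        name = row[0].strip()
        # Remove thousands separators and convert each figure to an integer
        details[name] = [int(cell.strip().replace(",", "")) for cell in row[1:5]]
    return details


# Assumed sample: the header names are hypothetical; the figures come from the output above
sample = io.StringIO(
    "State\tPop 2011\tPop 2023\tUrban 2011\tRural 2011\n"
    "ANDAMAN & NICOBAR\t380,581\t403,000\t237,093\t143,488\n"
)
state_details_int = read_census(sample)
print(state_details_int)
```

With the real file, the same function could be called as `with open(census, newline="") as f: read_census(f)`.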

Add new fields to "STATE.shp" to hold the census data. Multiple fields can be added in one call by passing a list of lists, one `[name, type]` pair per field.

In [29]:
arcpy.management.AddFields(result,
                           [['pop11', 'LONG'],
                            ['pop23', 'LONG'],
                            ['urbn_pop11', 'LONG'],
                            ['rurl_pop11', 'LONG']])
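
Because "STATE.shp" is a shapefile, its field names are limited to 10 characters, which is likely why the names above are abbreviated. The sketch below builds the list of lists from a plain list of names and rejects names that are too long. The helper `build_field_specs` and the constant `MAX_SHP_FIELD_LEN` are hypothetical names introduced for illustration; they are not part of arcpy.

```python
MAX_SHP_FIELD_LEN = 10  # shapefile (dBASE) field names hold at most 10 characters


def build_field_specs(names, field_type="LONG"):
    """Return [[name, type], ...] after checking that every name fits in a shapefile."""
    too_long = [n for n in names if len(n) > MAX_SHP_FIELD_LEN]
    if too_long:
        raise ValueError(
            f"Field names longer than {MAX_SHP_FIELD_LEN} characters: {too_long}"
        )
    return [[name, field_type] for name in names]


field_specs = build_field_specs(["pop11", "pop23", "urbn_pop11", "rurl_pop11"])
print(field_specs)
```

The resulting `field_specs` list has the same shape as the second argument passed to `arcpy.management.AddFields` above.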

Populate the newly added fields from the `state_details` dictionary.

In [34]:
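fields = ["STATE", "pop11", "pop23", "urbn_pop11", "rurl_pop11"]
with arcpy.da.UpdateCursor(result, fields) as cursor:
    for row in cursor:
        # Look up the census values by state name (the name must match a dictionary key)
        values = state_details[row[0]]

        # The dictionary stores numeric strings; convert them to integers for the LONG fields
        row[1] = int(values[0])
        row[2] = int(values[1])
        row[3] = int(values[2])
        row[4] = int(values[3])

        cursor.updateRow(row)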
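
The lookup `state_details[row[0]]` raises a `KeyError` if a state name in the shapefile has no matching key in the census dictionary, for example because of a spelling difference. A minimal sketch of a pre-check follows; the helper `find_unmatched` is hypothetical, and the inputs are placeholders (two names and totals from the printed output plus one made-up name).

```python
def find_unmatched(shapefile_names, census_details):
    """Return shapefile state names that have no entry in the census dictionary, sorted."""
    return sorted(set(shapefile_names) - set(census_details))


# Placeholder inputs: in the notebook, the names would come from the STATE field
# and the dictionary would be state_details
names_in_shapefile = ["ASSAM", "BIHAR", "EXAMPLE STATE"]  # "EXAMPLE STATE" is hypothetical
census_subset = {"ASSAM": ["31205576"], "BIHAR": ["104099452"]}

unmatched = find_unmatched(names_in_shapefile, census_subset)
print(unmatched)  # → ['EXAMPLE STATE']
```

In the notebook, the names could be collected with `arcpy.da.SearchCursor(result, ["STATE"])` before running the update, so that any mismatch is resolved first.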

# Metadata

Updating the metadata involves:
1. Getting the metadata
2. Updating it
3. Saving the changes

In [36]:
metadata = arcpy.metadata.Metadata(result)

metadata.title = "India state with population census"
metadata.tags = "India, States, Population"
metadata.summary = "The data contains boundaries of the state along with population census"
metadata.description = "The population census is for the years 2011 and 2023"
metadata.credits = "Survey of India, Wikipedia"

metadata.save()

# Exercise
1. For every state, check whether the sum of the urban and rural populations in 2011 equals the total population in 2011.
2. If any state shows a discrepancy in the first exercise, add a note about it to the metadata description.
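
One possible starting point for the exercise, offered as a sketch rather than a reference solution: the helper `find_discrepancies` and the wording of the note are assumptions, and the sample rows are copied from the printed output earlier in the notebook (stored as strings, as in `state_details`).

```python
def find_discrepancies(details):
    """Return names of states where urban + rural (2011) differs from the 2011 total."""
    mismatched = []
    for name, (pop11, _pop23, urban11, rural11) in details.items():
        if int(urban11) + int(rural11) != int(pop11):
            mismatched.append(name)
    return mismatched


# Sample rows taken from the printed output; replace with state_details in the notebook
sample_details = {
    "ANDAMAN & NICOBAR": ["380581", "403000", "237093", "143488"],
    "ASSAM": ["31205576", "35713000", "26807034", "4398542"],
}

mismatched = find_discrepancies(sample_details)

# Build a note only when at least one state fails the check
note = ""
if mismatched:
    note = "Urban + rural population (2011) does not equal the total for: " + ", ".join(mismatched)

print(mismatched, repr(note))
```

If `note` is not empty, it could be appended to `metadata.description` before calling `metadata.save()`.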