# Make New GDB
**Filename:** MakeNewGDB.ipynb <br>
**Author:** Laura Kaufmann <br>
**Purpose:** To automate the generation of an empty geodatabase according to the data dictionary<br>
**Methods:**<br>
- Import packages and set global variables, including the path to the Data Dictionary spreadsheet (most recent version uploaded to GitHub)<br>
- Delete (if existing) and create target geodatabase<br>
- Create domains and add their values<br>
- Create tables and feature classes, add fields and defaults, write metadata<br>
- Create relationship classes <br>

**Resources:**<br>
- [WGS 84 Spherical Mercator (WKID 3857)](https://epsg.io/3857)
- Entity Relationship Diagram (the most recent version is also uploaded to GitHub)
- [Create Domain](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-domain.htm)
- [Add Coded Value to Domain](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/add-coded-value-to-domain.htm)
- [Set Value for Range Domain](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/set-value-for-range-domain.htm)
- [Create Table](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-table.htm)
- [Create Feature Class](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-feature-class.htm)
- [Add Field](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/add-field.htm)
- [Assign Default to Field](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/assign-default-to-field.htm)
- [Metadata Classes](https://pro.arcgis.com/en/pro-app/latest/arcpy/metadata/metadata-class.htm)
- [Create Relationship Class](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-relationship-class.htm)

|Date|Editor|Changes|
|---|---|---|
|10/07/2023|L. Kaufmann|File created|
|11/22/2023|L. Kaufmann|Pushed to GitHub|
||||

In [1]:
# IMPORT PACKAGES
import arcpy
from arcpy import metadata as md

import os
import pandas as pd

import logging
logging.basicConfig(format='%(asctime)s - %(message)s', level=logging.INFO)

# ABOUT THE TARGET GEODATABASE
sr = arcpy.SpatialReference(3857)

fldr = r"C:\Users\Laura\Documents\Keepsakes\Travel\0_MetadataInstructions\2022 Database Migration"
name = r"Travel_Archive"
gdb = name + ".gdb"

# READ IN THE TEMPLATE SPREADSHEET
xlsx_fldr = r"C:\Users\Laura\Documents\Keepsakes\Travel\0_MetadataInstructions"
xlsx_file = r"Data_Dictionary.xlsx"
xlsx = os.path.join(xlsx_fldr, xlsx_file)

# METADATA VARIABLES
credits = "Schema designed and data populated by Laura Kaufmann (lmmk81914@gmail.com)"
constraints = "Data and schema can only be used with written permission from Laura Kaufmann (lmmk81914@gmail.com)"

# FOLDER OF SQL TXT FILES FOR VIEWS
sqlFldr = r"C:\Users\Laura\Documents\Keepsakes\Travel\0_MetadataInstructions\ViewSQL"

# FUNCTIONS
def getValue(argument):
    if argument == 'NONE':
        return None
    else:
        return argument
    
logging.info('Packages imported; ready to begin')

2023-11-24 10:55:32,188 - Packages imported; ready to begin
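
The `getValue` helper pairs with the `fillna('NONE')` calls in the later cells: empty spreadsheet cells are replaced with the string `'NONE'` and then mapped back to `None` before being passed to a tool. A minimal sketch of that round trip follows. The two rows are hypothetical and only stand in for the Fields sheet.

```python
import pandas as pd

def getValue(argument):
    """Map the 'NONE' placeholder back to None; pass anything else through."""
    return None if argument == 'NONE' else argument

# Hypothetical rows standing in for the 'Fields' sheet; values are illustrative only
fields = pd.DataFrame({
    'FieldName': ['TripName', 'TripStage'],
    'FieldDomain': [None, 'TripStage_CL'],
})
fields = fields.fillna('NONE')  # empty cells become the sentinel string

domains = [getValue(v) for v in fields['FieldDomain']]
print(domains)
```

Using a sentinel string keeps comparisons simple, since a missing value read by pandas (NaN) does not compare equal to itself.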


In [2]:
# DELETE AND CREATE THE TARGET GEODATABASE
wrkspc = os.path.join(fldr, gdb)
arcpy.env.overwriteOutput = True

# Only delete when a previous run left a geodatabase behind
if arcpy.Exists(wrkspc):
    arcpy.management.Delete(wrkspc)
arcpy.management.CreateFileGDB(fldr, name, "CURRENT")

logging.info('Blank file geodatabase created')

2023-11-24 10:57:36,375 - Blank file geodatabase created


In [3]:
# CREATE DOMAINS AND ADD VALUES
domains = pd.read_excel(xlsx, sheet_name='Domains')
domainValues = pd.read_excel(xlsx, sheet_name='DomainValues')

for index, row in domains.iterrows():
    domain_name = row['Name']
    domain_description = row['Description']
    field_type = row['FieldType']
    domain_type = row['DomainType']
    split_policy = row['SplitPolicy']
    merge_policy = row['MergePolicy']

    arcpy.management.CreateDomain(wrkspc, domain_name, domain_description, field_type, domain_type, split_policy, merge_policy)
        
    # Use a separate loop variable so the outer domain row is not shadowed
    for _, value_row in domainValues.iterrows():
        if value_row['Name'] == domain_name:
            if domain_type == 'CODED':
                arcpy.management.AddCodedValueToDomain(wrkspc, domain_name, value_row['Code'], value_row['ValueDescription'])
            else:
                arcpy.management.SetValueForRangeDomain(wrkspc, domain_name, value_row['MinValue'], value_row['MaxValue'])

    logging.info('%s domain and values added to the geodatabase', domain_name)

2023-11-24 10:57:55,075 - VoteType_CL domain and values added to the geodatabase
2023-11-24 10:58:09,068 - TripStage_CL domain and values added to the geodatabase
2023-11-24 10:58:30,423 - DayofWeek_CL domain and values added to the geodatabase
2023-11-24 10:58:57,118 - Currency_CL domain and values added to the geodatabase
2023-11-24 10:59:26,101 - TicketType_CL domain and values added to the geodatabase
2023-11-24 11:00:28,042 - Hour_CL domain and values added to the geodatabase
2023-11-24 11:00:50,054 - Minute_CL domain and values added to the geodatabase
2023-11-24 11:01:41,712 - LocationType_CL domain and values added to the geodatabase
2023-11-24 11:02:02,147 - Interest_CL domain and values added to the geodatabase
2023-11-24 11:02:15,728 - YesNoNAUnk_CL domain and values added to the geodatabase
2023-11-24 11:02:18,790 - MeasType_CL domain and values added to the geodatabase
2023-11-24 11:03:17,344 - Months_CL d
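
The cell above searches the full DomainValues sheet once per domain. One possible alternative, sketched here rather than taken from the notebook, is to group the values by `Name` a single time and look up each domain's rows directly. The column names match the cell above. The data is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the 'DomainValues' sheet; codes are illustrative only
domainValues = pd.DataFrame({
    'Name': ['YesNo_CL', 'YesNo_CL', 'Stage_CL'],
    'Code': ['Y', 'N', 'PLAN'],
    'ValueDescription': ['Yes', 'No', 'Planning'],
})

# Group once so each domain's rows can be fetched without rescanning the sheet
values_by_domain = {name: grp for name, grp in domainValues.groupby('Name')}

# Collect (code, description) pairs per domain, in sheet order
codes = {name: list(zip(grp['Code'], grp['ValueDescription']))
         for name, grp in values_by_domain.items()}
print(codes['YesNo_CL'])
```

In the notebook, the pairs for a given `domain_name` would then be passed to `AddCodedValueToDomain` one at a time.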

In [4]:
# CREATE TABLES
tables = pd.read_excel(xlsx, sheet_name='Tables')
tables = tables.fillna('NONE')

fields = pd.read_excel(xlsx, sheet_name='Fields')
fields = fields.fillna('NONE')

for index, row in tables.iterrows():
    
    out_name = row['Name']
    geometry_type = row['Geometry']
    has_m = row['HasM']
    has_z = row['HasZ']
    summary = row['TableDefinition']
    
    if row['Module'] == 'NONE':  # empty cells were filled with 'NONE' above
        tag = geometry_type.capitalize()
    else:
        tag = "{}, {}".format(row['Module'], geometry_type.capitalize())
        
    if geometry_type == 'TABLE':
        arcpy.management.CreateTable(wrkspc, out_name, None, '', '')
        logging.info('%s table created in the geodatabase', out_name)
    else:
        arcpy.management.CreateFeatureclass(wrkspc, out_name, geometry_type, None, has_m, has_z, sr)
        arcpy.management.RemoveSpatialIndex(os.path.join(wrkspc, out_name))
        logging.info('%s feature class created in the geodatabase', out_name)
    
    mdDesc = []
    
    for index, row in fields.iterrows():
        if row['Table'] == out_name:
            
            field_name = getValue(row['FieldName'])
            field_type = getValue(row['FieldType'])
            field_precision = getValue(row['Precision'])
            field_scale = getValue(row['Scale'])
            field_length = getValue(row['Length'])
            field_alias = getValue(row['FieldAlias'])
            field_is_nullable = getValue(row['Nullable'])
            field_is_required = getValue(row['Required'])
            field_domain = getValue(row['FieldDomain'])
            field_default = getValue(row['DefaultValue'])
            
            arcpy.management.AddField(os.path.join(wrkspc, out_name), field_name, field_type, field_precision, field_scale, field_length, field_alias, field_is_nullable, field_is_required, field_domain)
            
            if field_default is not None:
                arcpy.management.AssignDefaultToField(os.path.join(wrkspc, out_name), field_name, field_default)
            
            # Build the metadata suffix from whichever of default/domain is present
            if field_default is not None and field_domain is not None:
                setDesc = " (Default: {} ({}))".format(field_default, field_domain)
            elif field_default is not None:
                setDesc = " (Default: {})".format(field_default)
            elif field_domain is not None:
                setDesc = " ({})".format(field_domain)
            else:
                setDesc = ""
            
            fieldDesc = "{} ({}) - {}{}".format(field_name, field_type.capitalize(), row['FieldDefinition'], setDesc)
            mdDesc.append(fieldDesc)
            
    logging.info('Fields added to %s', out_name)
    
    new_md = md.Metadata()
    new_md.title = out_name
    new_md.tags = tag
    new_md.summary = summary
    new_md.description = '\n'.join(mdDesc)
    new_md.credits = credits
    new_md.accessConstraints = constraints
    
    tgt_item_md = md.Metadata(os.path.join(wrkspc, out_name))
    if not tgt_item_md.isReadOnly:
        tgt_item_md.copy(new_md)
        tgt_item_md.save()

2023-11-24 11:06:36,015 - Travelers table created in the geodatabase
2023-11-24 11:07:01,175 - Fields added to Travelers
2023-11-24 11:07:05,693 - Travelers_Contacts table created in the geodatabase
2023-11-24 11:07:32,632 - Fields added to Travelers_Contacts
2023-11-24 11:07:39,220 - Travelers_Facts table created in the geodatabase
2023-11-24 11:07:50,712 - Fields added to Travelers_Facts
2023-11-24 11:08:00,674 - Regions feature class created in the geodatabase
2023-11-24 11:09:11,287 - Fields added to Regions
2023-11-24 11:09:19,594 - Regions_Countries table created in the geodatabase
2023-11-24 11:10:56,146 - Fields added to Regions_Countries
2023-11-24 11:11:04,622 - Regions_Averages table created in the geodatabase
2023-11-24 11:11:27,127 - Fields added to Regions_Averages
2023-11-24 11:11:51,871 - Locations feature class created in the geodatabase
2023-11-24 11:13:20,973 - Fields added to Locations
2023-11-24 11
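
Each line of the metadata description follows the pattern `name (Type) - definition`, with an optional suffix naming the default value, the domain, or both. As a sketch, that formatting can be factored into a small function that is testable without a geodatabase. The name `describe_field` and the sample values are hypothetical and not part of the notebook.

```python
def describe_field(name, field_type, definition, default=None, domain=None):
    """Build one metadata description line for a field."""
    if default is not None and domain is not None:
        suffix = " (Default: {} ({}))".format(default, domain)
    elif default is not None:
        suffix = " (Default: {})".format(default)
    elif domain is not None:
        suffix = " ({})".format(domain)
    else:
        suffix = ""
    return "{} ({}) - {}{}".format(name, field_type.capitalize(), definition, suffix)

# Illustrative fields only; real names come from the Data Dictionary spreadsheet
lines = [
    describe_field("TripName", "TEXT", "Name of the trip"),
    describe_field("Stage", "TEXT", "Planning stage", default="PLAN", domain="TripStage_CL"),
]
print("\n".join(lines))
```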

In [5]:
# RELATIONSHIP CLASSES
relationships = pd.read_excel(xlsx, sheet_name='Relationship Classes')
relationships = relationships.fillna('NONE')

for index, row in relationships.iterrows():
    
    origin_table = os.path.join(wrkspc, getValue(row['OriginTable']))
    destination_table = os.path.join(wrkspc, getValue(row['DestinationTable']))
    out_relationship_class = os.path.join(wrkspc, getValue(row['RelationshipClass']))
    relationship_type = getValue(row['RelationshipType'])
    forward_label = getValue(row['ForwardLabel'])
    backward_label = getValue(row['BackwardLabel'])
    message_direction = getValue(row['MessageDirection'])
    cardinality = getValue(row['Cardinality'])
    attributed = getValue(row['Attributed'])
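    # Only two key columns are read from the sheet: O_PrimaryKey feeds both origin keys
    # and D_ForeignKey feeds both destination keys
    origin_primary_key = getValue(row['O_PrimaryKey'])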
    origin_foreign_key = getValue(row['O_PrimaryKey'])
    destination_primary_key = getValue(row['D_ForeignKey'])
    destination_foreign_key = getValue(row['D_ForeignKey'])
    
    arcpy.management.CreateRelationshipClass(origin_table, destination_table, out_relationship_class, relationship_type, forward_label, backward_label, message_direction, cardinality, attributed, origin_primary_key, origin_foreign_key, destination_primary_key, destination_foreign_key)
    logging.info('Relationship created from %s to %s', row['OriginTable'], row['DestinationTable'])

2023-11-24 11:20:57,054 - Relationship created from Trips to Travelers
2023-11-24 11:21:12,159 - Relationship created from Travelers to Travelers_Facts
2023-11-24 11:21:27,668 - Relationship created from Travelers to Travelers_Contacts
2023-11-24 11:21:50,392 - Relationship created from Trips to WorldHex15000
2023-11-24 11:22:05,493 - Relationship created from Trips to Regions
2023-11-24 11:22:19,558 - Relationship created from Regions_Countries to Regions
2023-11-24 11:22:33,067 - Relationship created from Regions to Regions_Averages
2023-11-24 11:22:55,412 - Relationship created from Regions to Locations
2023-11-24 11:23:08,838 - Relationship created from Locations to Locations_Hours
2023-11-24 11:23:22,696 - Relationship created from Locations to Locations_Tickets
2023-11-24 11:23:36,407 - Relationship created from Locations to Locations_Notes
2023-11-24 11:23:59,903 - Relationship created from Regions to Regions_Taxis
2023-11-24 11:24:14,674 - Rela
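
Each relationship takes several seconds to create, so it may be worth checking the sheet before the loop starts. The sketch below assumes a pre-flight check that lists any table named in the Relationship Classes sheet but missing from a list of known tables. The function name `missing_tables` and the sample rows are hypothetical. The column names match the cell above.

```python
def missing_tables(relationship_rows, table_names):
    """Return tables referenced by relationships but absent from table_names, sorted."""
    known = set(table_names)
    referenced = set()
    for row in relationship_rows:
        referenced.add(row['OriginTable'])
        referenced.add(row['DestinationTable'])
    return sorted(referenced - known)

# Illustrative data only; in the notebook these would come from the spreadsheet
tables = ['Travelers', 'Travelers_Facts', 'Regions']
rels = [
    {'OriginTable': 'Travelers', 'DestinationTable': 'Travelers_Facts'},
    {'OriginTable': 'Regions', 'DestinationTable': 'Locations'},
]
print(missing_tables(rels, tables))
```

With pandas, the rows could be supplied as `relationships.to_dict('records')` and the table names as `tables['Name']`.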