**BACKGROUND:** As the saying goes, "change happens" or "change is inevitable".  The challenge is how to adapt to change.  In the realm of SQL, having to update queries can be a maintenance nightmare.  If you are well versed in SQL, you already know that in a complicated query it is rarely a matter of adding, removing, or renaming a column in just one place.  Often you have to make the exact same modification in several locations of the query.  In addition, debugging SQL can be difficult and time consuming because of notoriously unhelpful error messages.  This is where a "templating engine" (or "template engine") is so useful.  A templating engine essentially lets you make the same modification in multiple places of your SQL query simultaneously.  Anything we can do to reduce or eliminate manual modifications to SQL queries is worthwhile, since typos will eventually bite us.  It is called a "templating engine" because you literally create a template of the base structure of your query, and the engine then inserts text into designated locations in that template to create the actual query (or any other text-based document).  Those automated, seemingly personalized emails that you receive?  Chances are they were created with a templating engine.  Templating engines have been in use for years to generate all sorts of text documents, not just code or SQL queries.

**Use Case:** This notebook explains how to use a Python-based template engine called [Jinja](https://jinja.palletsprojects.com/) to dynamically generate a SQL query that flattens a vendor's data.  In this example, the SQL query needs to include several feature IDs and their respective feature names.  The problem is that, as already mentioned, things change: we may later need to add a feature or rename one.  The solution?  By maintaining a simple table of feature IDs and their corresponding feature names, we can have Jinja generate the SQL from this table, which removes the need to hard-code the feature IDs and feature names.
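
Before the full example, the core idea can be sketched in a few lines.  This is only an illustration: the table name `vehicles` and the column passed in as `col` are hypothetical and are not part of the vendor query used later.  The point is that a value supplied once at render time lands in every placeholder.

```python
from jinja2 import Template

# Hypothetical mini-query: the grouping column appears in three places.
sql_template = Template(
    "select {{ col }}, count(*) as n\n"
    "from {{ table }}\n"
    "group by {{ col }}\n"
    "order by {{ col }}"
)

# Supplying the column name once fills in every {{ col }} placeholder.
rendered = sql_template.render(col="make", table="vehicles")
print(rendered)
```

Switching the grouping column to something else means changing one argument to `render`, not three places in the SQL text.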

**Jinja2 Resources**

- [Tutorial](https://ttl255.com/jinja2-tutorial-part-2-loops-and-conditionals/)
- [Handling comma in front or not in front of column names](https://towardsdatascience.com/advanced-sql-templates-in-python-with-jinjasql-b996eadd761d)

In [1]:
import pandas as pd

#### Import sample data from the clipboard containing the feature ID to feature name mapping

In [2]:
d1_to_tv_features = pd.read_clipboard()

In [3]:
d1_to_tv_features

Unnamed: 0,feature_id,feature
0,3645,_360_DEGREE_CAMERA
1,20,ACTIVE_HIGH_BEAMS
2,168,ADAPTIVE_CRUISE_CONTROL
3,1128,ADAPTIVE_HEADLIGHTS
4,1338,ANTI_THEFT_ALARM
5,1372,ANTI_THEFT_ENGINE_IMMOBILIZER
6,3147,AUTOMATIC_PARKING_ASSIST_FULL
7,3148,AUTOMATIC_PARKING_ASSIST_SEMI
8,4224,BLIND_SPOT_INTERVENTION
9,129,BLIND_SPOT_WARNING


Let's pretend the data above represents a table that we maintain, mapping our vendor's feature IDs to the corresponding vehicle feature names.  The feature IDs and feature names in this table are what will be injected into our SQL query template.  Before doing so, we need to convert the tabular data into a Python list of dictionaries, with one "key-value" record per row:

In [4]:
features_dict = d1_to_tv_features.to_dict('records')

In [5]:
features_dict

[{'feature_id': 3645, 'feature': '_360_DEGREE_CAMERA'},
 {'feature_id': 20, 'feature': 'ACTIVE_HIGH_BEAMS'},
 {'feature_id': 168, 'feature': 'ADAPTIVE_CRUISE_CONTROL'},
 {'feature_id': 1128, 'feature': 'ADAPTIVE_HEADLIGHTS'},
 {'feature_id': 1338, 'feature': 'ANTI_THEFT_ALARM'},
 {'feature_id': 1372, 'feature': 'ANTI_THEFT_ENGINE_IMMOBILIZER'},
 {'feature_id': 3147, 'feature': 'AUTOMATIC_PARKING_ASSIST_FULL'},
 {'feature_id': 3148, 'feature': 'AUTOMATIC_PARKING_ASSIST_SEMI'},
 {'feature_id': 4224, 'feature': 'BLIND_SPOT_INTERVENTION'},
 {'feature_id': 1327, 'feature': 'BRAKE_ASSIST'},
 {'feature_id': 1508, 'feature': 'BRAKE_OVERRIDE'},
 {'feature_id': 46, 'feature': 'DAYTIME_RUNNING_LIGHTS_HALOGEN'},
 {'feature_id': 3055, 'feature': 'DAYTIME_RUNNING_LIGHTS_LED'},
 {'feature_id': 3053, 'feature': 'DRIVER_ATTENTION_ALERT'},
 {'feature_id': 603, 'feature': 'EMERGENCY_BRAKING_PREPARATION'},
 {'feature_id': 4268, 'feature': 'FORWARD_COLLISION_MITIGATION'},
 {'feature_id': 161, 'feature': 'F

From above, we see that we end up with a Python list containing 30 dictionaries, each pairing a feature ID with its feature name.
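
Because `pd.read_clipboard()` depends on whatever happens to be on the clipboard, the conversion is easier to try with a small frame built in code.  The sketch below uses only the first three rows shown earlier, and the variable names `sample_features` and `sample_records` are made up for this illustration.

```python
import pandas as pd

# First three rows of the mapping shown above, typed in by hand
sample_features = pd.DataFrame(
    {
        "feature_id": [3645, 20, 168],
        "feature": [
            "_360_DEGREE_CAMERA",
            "ACTIVE_HIGH_BEAMS",
            "ADAPTIVE_CRUISE_CONTROL",
        ],
    }
)

# orient="records" yields one dictionary per row, keyed by column name
sample_records = sample_features.to_dict(orient="records")
print(sample_records)
```

Each row becomes its own dictionary, which is the shape the template loops over later.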

#### Now comes the fun part: using Jinja

In [6]:
from jinja2 import Template

The query template in cell `In [7]` below uses Jinja's expression and looping syntax (look for `{{ ... }}` and `{% ... %}`) to loop through the features in the Python list of dictionaries that we defined above (`features_dict`).  See lines 24 through 26, lines 39 and 40, and lines 45 through 47 of that cell.  It is this scripting capability that gives Jinja its superpowers.
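
The comma handling in the full template (first item written outside the loop, the remaining items inside it with a leading comma) can be tried in isolation first.  The following is a reduced sketch, not the notebook's actual query; `features` and `in_list_template` are names chosen for this illustration, and the three records are the first rows of the mapping shown earlier.

```python
from jinja2 import Template

features = [
    {"feature_id": 3645, "feature": "_360_DEGREE_CAMERA"},
    {"feature_id": 20, "feature": "ACTIVE_HIGH_BEAMS"},
    {"feature_id": 168, "feature": "ADAPTIVE_CRUISE_CONTROL"},
]

# {{ ... }} prints a value; {% for %} ... {% endfor %} repeats its body.
# The first ID is emitted on its own so that every later ID can carry
# a leading comma without producing a stray comma at the start.
in_list_template = Template(
    "in (\n"
    "    '{{ features[0]['feature_id'] }}'"
    "{% for feature in features[1:] %}\n"
    "    , '{{ feature['feature_id'] }}'{% endfor %}\n"
    ")"
)

in_list = in_list_template.render(features=features)
print(in_list)
```

Note that this pattern assumes the list contains at least one feature, since it indexes `features[0]` directly.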

In [7]:
template_standard_generic_equipment = """
with standard_generic_equip_long as (
select
    vin
    , lf_styles.value:basic_data:year::integer as YEAR
    , lf_styles.value:basic_data:make::STRING AS make
    , lf_styles.value:basic_data:model::STRING AS model
    , lf_styles.value:basic_data:is_oem_build_data::STRING AS is_oem_build_data
    , lf_styles.value:basic_data:d1_verified_record_level::STRING d1_verified_record_level
    , array_size(raw_json['query_responses']['python3_example']['us_market_data']['us_styles']) AS style_count
    , lf_styles.value:name::string as style_name
    , lf_styles.value:complete::string as complete_style
    , lf_generic_equipment_values.value:generic_equipment_id::number as generic_equipment_id
    , lf_generic_equipment_values.value:generic_equipment_value::string as generic_equipment_value
from
    vehicle_data_eval.d1_raw_json_pl
    , lateral flatten(raw_json['query_responses']['python3_example']['us_market_data']['us_styles']) as lf_styles
    , lateral flatten(lf_styles.value:standard_generic_equipment) AS lf_standard_generic_equip
    , lateral flatten(lf_standard_generic_equip.value:generic_equipment_categories) as lf_generic_equipment_categories
    , lateral flatten(lf_generic_equipment_categories.value:generic_equipment) as lf_generic_equipment
    , lateral flatten(lf_generic_equipment.value:generic_equipment_values) as lf_generic_equipment_values
where
    lf_generic_equipment_values.value:generic_equipment_id IN(
        '{{ features_dict[0]['feature_id'] }}'\
        {% for feature in features_dict[1:] %}
        , '{{ feature['feature_id'] }}'{% endfor %}
    )
)   -- END WITH
SELECT
    VIN 
    , style_count
    , YEAR  
    , make 
    , model 
    , is_oem_build_data 
    , d1_verified_record_level 
    , style_name
    , complete_style 
    {% for feature in features_dict %}
    , "'{{ feature['feature_id'] }}'" AS {{ feature['feature'] }}{% endfor %}
from
    standard_generic_equip_long
    pivot (min(generic_equipment_value)
        for generic_equipment_id in (
            '{{ features_dict[0]['feature_id'] }}'\
            {% for feature in features_dict[1:] %}
            , '{{ feature['feature_id'] }}'{% endfor %}
        )
    )
order by
    vin
    , style_name
"""

#### The Generated SQL Query

With the template defined above and the features mapping that we provide in `features_dict`, Jinja can now dynamically generate the actual SQL query that we need.

In [8]:
j2_template = Template(template_standard_generic_equipment)

print(j2_template.render(features_dict=features_dict))


with standard_generic_equip_long as (
select
    vin
    , lf_styles.value:basic_data:year::integer as YEAR
    , lf_styles.value:basic_data:make::STRING AS make
    , lf_styles.value:basic_data:model::STRING AS model
    , lf_styles.value:basic_data:is_oem_build_data::STRING AS is_oem_build_data
    , lf_styles.value:basic_data:d1_verified_record_level::STRING d1_verified_record_level
    , array_size(raw_json['query_responses']['python3_example']['us_market_data']['us_styles']) AS style_count
    , lf_styles.value:name::string as style_name
    , lf_styles.value:complete::string as complete_style
    , lf_generic_equipment_values.value:generic_equipment_id::number as generic_equipment_id
    , lf_generic_equipment_values.value:generic_equipment_value::string as generic_equipment_value
from
    vehicle_data_eval.d1_raw_json_pl
    , lateral flatten(raw_json['query_responses']['python3_example']['us_market_data']['us_styles']) as lf_styles
    , lateral flatten(lf_styles.value:standar

The above is the SQL that was generated by the Jinja templating engine.  We can see that Jinja entered or "typed" the feature IDs and feature names for us in the locations where we need them, along with the proper commas and single or double quotes.  This may not seem interesting or profound at first, but imagine how you would have to change the query if we had to remove or rename certain features.  Doing this manually, without a templating engine, would be very error-prone!  With Jinja, all we have to do is pass in a data dictionary or mapping and let Jinja do the rest.
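
To see that payoff on a small scale, the sketch below renders the same column alias list twice: once with three features and once after one feature is dropped from the mapping.  Only the input data changes; the template text is untouched.  The names `alias_template`, `all_features`, and `fewer_features` are invented for this example.

```python
from jinja2 import Template

# One aliased pivot column per feature, mirroring the SELECT list pattern
alias_template = Template(
    "{% for feature in features %}"
    ", \"'{{ feature['feature_id'] }}'\" AS {{ feature['feature'] }}\n"
    "{% endfor %}"
)

all_features = [
    {"feature_id": 3645, "feature": "_360_DEGREE_CAMERA"},
    {"feature_id": 20, "feature": "ACTIVE_HIGH_BEAMS"},
    {"feature_id": 168, "feature": "ADAPTIVE_CRUISE_CONTROL"},
]

# Drop one feature from the mapping; the template itself is not edited
fewer_features = [f for f in all_features if f["feature"] != "ACTIVE_HIGH_BEAMS"]

before = alias_template.render(features=all_features)
after = alias_template.render(features=fewer_features)

print(before)
print(after)
```

The second render simply omits the dropped column, with commas and quoting still in the right places.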

As with any tool we use, there are pros and cons.  So what are the cons of using a template engine?  You do have to learn the templating engine's scripting syntax.  Once you get used to the unusual syntax, though, the learning curve is relatively short.  Keep in mind that this scripting capability is exactly what makes a templating engine so powerful.