# Introduction to SQL for Excel Users – Part 6: More Feature Engineering

[Original post](https://www.daveondata.com/blog/introduction-to-sql-for-excel-users-part-6-more-feature-engineering/)

## More Feature Engineering in SQL – Round 1

Here is one of the queries from the last post that emulates the screenshot above:

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
FROM FactCallCenter FCC
```

Modifying the above SQL to craft the AvgCallsPerOperator column is quite easy:

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      ,FCC.Calls / (FCC.LevelOneOperators + FCC.LevelTwoOperators) AS AvgCallsPerOperator
FROM FactCallCenter FCC
```

Check out the AvgCallsPerOperator column. What happened to the fractional (i.e., decimal) part of the calculation?

If you recall, I mentioned earlier that SQL Server relies on explicit data types.

This also applies to math calculations…

## More Feature Engineering in SQL – Round 2

Check out the structure (i.e., the schema) for FactCallCenter in SSMS.

Looking at the above (ignore most of it, I’ll cover more over time), the following are important:

- LevelOneOperators is defined in the DB as an integer (i.e., technically a smallint, don’t worry about that for now)
- LevelTwoOperators is defined in the DB as an integer
- Calls is defined in the DB as an integer

Not surprisingly, given the above, when the AllOperators feature is engineered in the SELECT it is also treated as an integer – it’s simply the result of adding two integer columns.

Since integers are defined as whole numbers, it totally makes sense that when you divide two integers the fractional part (i.e., the stuff to the right of the decimal point) isn’t present.
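If you want to see this behavior without touching the table, here is a minimal sketch using literal values. The variables `@Calls` and `@Operators` are hypothetical stand-ins I made up to mimic the column types; they are not part of FactCallCenter.

```sql
-- Both operands are integers, so SQL Server performs integer division
-- and drops everything to the right of the decimal point.
SELECT 7 / 2 AS IntegerDivision;  -- returns 3, not 3.5

-- Hypothetical variables typed like the table's columns (assumed values).
DECLARE @Calls INT = 27;
DECLARE @Operators SMALLINT = 4;

-- An INT divided by a SMALLINT is still integer division.
SELECT @Calls / @Operators AS AvgCallsPerOperator;  -- returns 6, not 6.75
```

Notice that the result is truncated, not rounded: 6.75 becomes 6, not 7.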

What is needed is to force at least one of the numbers to be a decimal and then we’ll get the fractional part.

You can use the CAST function in SQL to transform data from one type to another.

Using CAST is powerful stuff, but for now I will keep things really simple. Take the following SQL code snippet:

```
CAST(FCC.Calls AS DECIMAL(6,2))
```

The easiest way to think of the above code snippet is from the DB’s perspective:

- “OK, Dave wants me to take the Calls column from the FactCallCenter table and do something with it”
- “Ah, Dave wants me to CAST the column to…”
- “A DECIMAL big enough to have 6 digits in total and 2 digits to the right of the decimal point”
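To make that concrete, here is a small sketch with literal values standing in for the Calls column. The numbers are made up for illustration.

```sql
-- CAST turns the integer 27 into a DECIMAL with 2 digits after the point.
SELECT CAST(27 AS DECIMAL(6,2)) AS CallsAsDecimal;  -- returns 27.00

-- With one DECIMAL operand, the division keeps its fractional part.
-- SQL Server picks the precision of the result, so expect 6.75
-- followed by some trailing zeros.
SELECT CAST(27 AS DECIMAL(6,2)) / 4 AS AvgCallsPerOperator;
```

One thing to keep in mind: DECIMAL(6,2) leaves only 4 digits to the left of the decimal point, so the largest value it can hold is 9999.99. Casting anything bigger raises an arithmetic overflow error, and you would need a larger precision such as DECIMAL(10,2).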

Oh, I should mention that SQL keywords are not case sensitive (CAST and cast work the same), but I prefer using capitalization in my SQL code – just in case you were wondering. 😉

With this new SQL goodness I can change the query to be:


```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      ,CAST(FCC.Calls AS DECIMAL(6,2)) / (FCC.LevelOneOperators + FCC.LevelTwoOperators) AS AvgCallsPerOperator
FROM FactCallCenter FCC
```

Voila!

NOTE – It doesn’t matter which of the INTEGERs you CAST, just that one of the values needs to be a DECIMAL.
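As a sketch of that note, here is a trimmed-down variation that casts the denominator instead of Calls. It assumes the same FactCallCenter table and columns used throughout this post.

```sql
-- Variation: CAST the operator total instead of the Calls column.
-- The numerator stays an integer, but the DECIMAL denominator is
-- enough to keep the fractional part of the result.
SELECT FCC.Calls
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      ,FCC.Calls / CAST(FCC.LevelOneOperators + FCC.LevelTwoOperators AS DECIMAL(6,2)) AS AvgCallsPerOperator
FROM FactCallCenter FCC
```

Either way you get the fractional part. The number of digits SQL Server shows after the decimal point may differ between the two versions, since the result's precision depends on the types of both operands.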

While it might seem that Excel’s way is easier, there are many advantages to knowing that SQL Server doesn’t mess with your data types automagically.

## The Learning Arc

In the next post I’m going to switch gears a bit and explore how you can use SQL to accomplish the same types of calculations that pivot tables provide in Excel. 😲

This is one area where SQL really shines and provides a number of advantages over “out of the box” Excel.

Stay healthy and happy data sleuthing!