# Introduction to SQL for Excel Users – Part 5: Basic Feature Engineering

[Original post](https://www.daveondata.com/blog/introduction-to-sql-for-excel-users-part-5-basic-feature-engineering/)

## Feature Engineering in SQL

As we’ll see throughout the series, you can do a lot of cool things with your SQL SELECTs.

One of these things is engineering new features from existing columns.

Take our query from the previous post:

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
FROM FactCallCenter FCC
```

We can craft a new feature in our SQL quite easily by just adding LevelOneOperators to LevelTwoOperators like this:

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators
FROM FactCallCenter FCC
```

Sweet! That’s not too complicated.

If you execute the above query in SQL Server Management Studio (SSMS), check out the header of the new column in the output: it reads (No column name).

As with Excel, you have to provide a name (i.e., an alias) for this new column. While there are many legit ways to do this, I will use the following throughout the series:

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
FROM FactCallCenter FCC
```

If you re-run the query in SSMS you now get a name for the column.

Awesome!
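For reference, here is a minimal sketch of three alias spellings that T-SQL accepts. It runs against a small inline table so it can be executed without the sample database; the `Demo` name and its two rows are made up for illustration and are not data from FactCallCenter.

```sql
-- Hypothetical inline rows standing in for FactCallCenter (values are made up).
SELECT Demo.LevelOneOperators + Demo.LevelTwoOperators AS AllOperatorsA -- AS keyword (the style used in this series)
      ,Demo.LevelOneOperators + Demo.LevelTwoOperators AllOperatorsB    -- AS keyword omitted
      ,AllOperatorsC = Demo.LevelOneOperators + Demo.LevelTwoOperators  -- T-SQL "alias = expression" form
FROM (VALUES (2, 7), (3, 9)) AS Demo (LevelOneOperators, LevelTwoOperators);
```

All three columns hold the same values. The explicit AS form tends to be the easiest to read, which is one reason to stick with it.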

There’s just one more thing I need to cover…

## Engineering a Feature From a Feature in SQL

Based on knowledge of Excel, it would be quite intuitive to engineer AllOperators2 using SQL thusly:

```
-- Does not run: SQL Server reports an invalid column name (AllOperators)
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      ,AllOperators - 42 AS AllOperators2
FROM FactCallCenter FCC
```

If you run the above query, however, SSMS is not happy: it reports an “Invalid column name” error for AllOperators.

If we check FactCallCenter in SSMS we can quickly see the reason why: There is no AllOperators listed as a column!

😕

The above is an example of one of the most important differences between Excel and T-SQL.

As an analytics pro, your SQL queries almost always operate on/with virtual tables.

This is evidenced ☝ by the query that produces the AllOperators column via feature engineering.

The SELECT, in essence, creates an in-memory copy of the data you asked for and then adds to this in-memory copy the AllOperators virtual column.

BTW – In this case it’s intuitive to think of “in-memory” as “existing only long enough to get the data back.”
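To see that the engineered column lives only in the query result, here is a small sketch using a temp table. The `#CallCenterDemo` name and its rows are made up for illustration; it stands in for FactCallCenter so nothing in the sample database is touched.

```sql
-- Hypothetical temp table with made-up rows, standing in for FactCallCenter.
CREATE TABLE #CallCenterDemo (LevelOneOperators INT, LevelTwoOperators INT);

INSERT INTO #CallCenterDemo (LevelOneOperators, LevelTwoOperators)
VALUES (2, 7), (3, 9);

-- The result of this SELECT has three columns, including AllOperators.
SELECT Demo.LevelOneOperators
      ,Demo.LevelTwoOperators
      ,Demo.LevelOneOperators + Demo.LevelTwoOperators AS AllOperators
FROM #CallCenterDemo Demo;

-- The table itself still has only its two original columns.
SELECT * FROM #CallCenterDemo;

-- Clean up the temp table.
DROP TABLE #CallCenterDemo;
```

The first result set includes AllOperators; the second does not, because the SELECT never changed the underlying table.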

Virtual tables are super awesome and allow you to do many wondrous things that can’t be done with “out of the box Excel” (i.e., without resorting to technology like DAX).

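Given this knowledge of how the SELECT is creating a virtual table, the fix follows: AllOperators exists only in the query’s output, not in FactCallCenter, so the second feature has to repeat the underlying expression. This is how the query can be written to produce AllOperators2: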

```
SELECT FCC.WageType
      ,FCC.Shift
      ,FCC.LevelOneOperators
      ,FCC.LevelTwoOperators
      ,FCC.Calls
      ,FCC.Date
      ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      ,(FCC.LevelOneOperators + FCC.LevelTwoOperators) - 42 AS AllOperators2
FROM FactCallCenter FCC
```

Awesome!
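As an aside, one way to avoid repeating the expression is to compute AllOperators in an inner query and then reference it by name from an outer SELECT. This is only a sketch: the `Ops` alias is a name made up for illustration, and everything else comes from the query above.

```sql
-- The inner SELECT builds a virtual table that includes AllOperators.
-- The outer SELECT can then treat AllOperators like any other column.
SELECT Ops.WageType
      ,Ops.Shift
      ,Ops.LevelOneOperators
      ,Ops.LevelTwoOperators
      ,Ops.Calls
      ,Ops.Date
      ,Ops.AllOperators
      ,Ops.AllOperators - 42 AS AllOperators2
FROM (
      SELECT FCC.WageType
            ,FCC.Shift
            ,FCC.LevelOneOperators
            ,FCC.LevelTwoOperators
            ,FCC.Calls
            ,FCC.Date
            ,FCC.LevelOneOperators + FCC.LevelTwoOperators AS AllOperators
      FROM FactCallCenter FCC
     ) AS Ops
```

Here the outer query reads from the virtual table produced by the inner query, so AllOperators is a real column as far as the outer SELECT is concerned.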

I can see where you might be thinking, “Dave, this virtual table stuff just seems like more work and not very cool at all.”

That’s a fair point given this wildly contrived, simple example.

Stay tuned; the awesomeness shall be demonstrated!

## The Learning Arc

In the next post I’m going to continue with feature engineering as it relates to explicit data types in SQL.

This is another area where SQL is quite a bit different compared to Excel.

At this point, you should really get access to SQL Server. Be sure to check out my YouTube channel for tutorial videos to make this happen.

Stay healthy and happy data sleuthing!