# Summarize data using Azure Databricks

Start by selecting the scored data generated by the Azure Data Factory pipeline.

In [0]:
%sql
select * from scoredflights

Run the previous cell. You should see a table of the scored data. Scroll all the way to the right to find the prediction column, which contains the flight delay prediction produced by your machine learning model.

In the following cell, you will create a table that summarizes the flight delay data. Instead of one row per flight, the summary contains one row per origin airport, day, and scheduled departure hour, along with the number of predicted delays for that hour. The query is limited to April (Month = 4) and keeps only the groups with more than one predicted delay. It also joins the **airport_code_location_lookup_clean** table you created at the beginning of the lab, so that we can extract the airport coordinates.

In [0]:
%sql
SELECT OriginAirportCode, Month, DayofMonth, CRSDepHour, SUM(prediction) AS NumDelays,
    CONCAT(Latitude, ',', Longitude) AS OriginLatLong
    FROM scoredflights s
    INNER JOIN airport_code_location_lookup_clean a
    ON s.OriginAirportCode = a.Airport
    WHERE Month = 4
    GROUP BY OriginAirportCode, OriginLatLong, Month, DayofMonth, CRSDepHour
    HAVING SUM(prediction) > 1
    ORDER BY NumDelays DESC
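
To make the shape of the summary concrete, here is a minimal plain-Python sketch of the same grouping logic on a handful of made-up rows. The sample flights, the coordinates, and the helper name `summarize_delays` are illustrative assumptions and not part of the lab data; in the notebook, the Spark SQL cell above does the real work.

```python
from collections import defaultdict

# Hypothetical scored rows:
# (OriginAirportCode, Month, DayofMonth, CRSDepHour, prediction)
scored = [
    ("SEA", 4, 1, 8, 1),
    ("SEA", 4, 1, 8, 1),
    ("SEA", 4, 1, 8, 0),
    ("SEA", 4, 1, 9, 1),   # only one predicted delay in this hour
    ("JFK", 4, 2, 17, 1),
    ("JFK", 4, 2, 17, 1),
    ("JFK", 4, 2, 17, 1),
    ("JFK", 5, 2, 17, 1),  # not April, so it is filtered out
]

# Hypothetical lookup table: Airport -> (Latitude, Longitude)
lookup = {"SEA": (47.45, -122.31), "JFK": (40.64, -73.78)}


def summarize_delays(rows, airports, month=4, min_delays=1):
    """Mimic the WHERE / JOIN / GROUP BY / HAVING / ORDER BY steps."""
    totals = defaultdict(int)
    for code, m, day, hour, prediction in rows:
        # WHERE Month = 4, and INNER JOIN drops airports missing from the lookup
        if m != month or code not in airports:
            continue
        lat, lon = airports[code]
        key = (code, m, day, hour, f"{lat},{lon}")
        totals[key] += prediction  # SUM(prediction)
    # HAVING SUM(prediction) > min_delays
    summary = [key[:4] + (n, key[4]) for key, n in totals.items() if n > min_delays]
    # ORDER BY NumDelays DESC
    return sorted(summary, key=lambda row: row[4], reverse=True)


summary_rows = summarize_delays(scored, lookup)
for row in summary_rows:
    print(row)
```

Each output tuple mirrors one row of the summary: airport, month, day, hour, NumDelays, and OriginLatLong. The 9 o'clock SEA group is dropped because it has only one predicted delay.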

The final step is to save this summary calculation as a table, which we can later query using Power BI (in the next exercise).

In [0]:
# Same query as the previous cell, captured as a DataFrame
summary = spark.sql("""
    SELECT OriginAirportCode, Month, DayofMonth, CRSDepHour, SUM(prediction) AS NumDelays,
        CONCAT(Latitude, ',', Longitude) AS OriginLatLong
    FROM scoredflights s
    INNER JOIN airport_code_location_lookup_clean a
        ON s.OriginAirportCode = a.Airport
    WHERE Month = 4
    GROUP BY OriginAirportCode, OriginLatLong, Month, DayofMonth, CRSDepHour
    HAVING SUM(prediction) > 1
    ORDER BY NumDelays DESC
""")

In [0]:
summary.write.mode("overwrite").saveAsTable("flight_delays_summary")

Run the following cell to verify that the table contains data.

In [0]:
%sql
select * from flight_delays_summary
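
Beyond eyeballing the result, you may want a quick consistency check. The following is a rough sketch, not part of the lab: the helper name `check_summary` and the sample rows are hypothetical, and it assumes each row is a dictionary keyed by the summary column names.

```python
def check_summary(rows, month=4, min_delays=1):
    """Return a list of problems found in the summary rows (empty means OK)."""
    problems = []
    if not rows:
        problems.append("table is empty")
    for i, row in enumerate(rows):
        # The query filters on Month = 4
        if row["Month"] != month:
            problems.append(f"row {i}: unexpected Month {row['Month']}")
        # The HAVING clause keeps only groups with more than one predicted delay
        if row["NumDelays"] <= min_delays:
            problems.append(f"row {i}: NumDelays {row['NumDelays']} is too low")
    return problems


# Hypothetical sample standing in for rows read from flight_delays_summary
sample = [
    {"OriginAirportCode": "JFK", "Month": 4, "DayofMonth": 2,
     "CRSDepHour": 17, "NumDelays": 3, "OriginLatLong": "40.64,-73.78"},
    {"OriginAirportCode": "SEA", "Month": 4, "DayofMonth": 1,
     "CRSDepHour": 8, "NumDelays": 2, "OriginLatLong": "47.45,-122.31"},
]

problems = check_summary(sample)
print(problems or "summary looks consistent")
```

In the notebook, the rows could be obtained with something like `[r.asDict() for r in spark.table("flight_delays_summary").collect()]`, assuming the summary is small enough to collect to the driver.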

## Next steps

You are done executing notebooks in this lab. Please continue to the next exercise, **Exercise 7: Visualizing in Power BI Desktop** in the hands-on lab document.