Gamas Flour Mill is the largest producer of flour, pasta, and noodles across the 50 states of the United States of America. With an extensive network of branches in every state and an annual revenue exceeding $1 billion, Gamas continues to lead the market in food production and distribution.
This project showcases SQL queries and analysis on production data from Gamas Flour Mill, the largest producer of flour, pasta, and noodles in the United States. For data-driven insights and performance tracking, we collected and consolidated data from the various divisions of Gamas Flour Mill into four primary tables:
flour_productionpasta_productionnoodle_productionstate_lookup
yearperiodstate_ansicommodity_idvalue
state_ansi– used as a foreign key to join with the commodity tablesstate_name
Using SQL, we will join these four tables to perform unified multi-commodity analysis across all U.S. states. This will enable insights such as:
Retrieved a complete list of U.S. states along with their ANSI codes from the state_lookup table to support clean joins across datasets.
SELECT state_name, state_ansi
FROM state_lookup;Aggregated pasta production data for 2019 to support the yearly management report.
SELECT SUM(value) AS total_pasta_2019
FROM pasta_production
WHERE year = '2019';Calculated the total noodle production to assess past performance.
SELECT SUM(value) AS total_noodle_2010
FROM noodles_production
WHERE year = '2010';Prepared this insight for the Flour division.
SELECT AVG(value) AS avg_flour_2016
FROM flour_production
WHERE year = '2016';Computed the average flour production for all years where noodle production exceeded 1 million units. This helped cross-reference commodity impact.
SELECT AVG(f.value) AS average_flour_production
FROM flour_production f
WHERE f.year IN (
SELECT DISTINCT year
FROM noodle_production np
WHERE value > 1000000
);Identified states present in state_lookup but missing pasta production entries for 2021 to address data completeness and reporting gaps.
SELECT COUNT(DISTINCT s.state_name) AS missing_states_in_2021
FROM state_lookup s
LEFT JOIN pasta_production p
ON s.state_ansi = p.state_ansi AND p.year = '2021'
WHERE p.state_ansi IS NULL;7. The total pasta production figures for all states in 2017 that also have recorded noodle production data from 2018.
Found states with pasta production in 2017 that also had noodle data in 2018 to support inter-commodity correlation analysis.
SELECT SUM(pp.value) AS total_pasta_2017
FROM pasta_production pp
WHERE pp.year = '2017'
AND pp.state_ansi IN (
SELECT DISTINCT np.state_ansi
FROM noodle_production np
WHERE np.year = '2018'
);8. All states with Flour Production in April 2020,even if they didn’t produce any flour in April of 2020? (including zero).
Listed all states with their flour production in April 2020 (even if zero), supporting a cross-commodity production report for that period.
SELECT s.state_name,
SUM(f.value) AS total_flour_apr_2020
FROM state_lookup s
LEFT JOIN flour_production f
ON s.state_ansi = f.state_ansi
AND f.period = 'APR'
AND f.year = '2020'
GROUP BY s.state_name;Identified states with flour production greater than $500M in April 2021 to assist the flour Department’s regional marketing strategy.
SELECT l.state_name, p.period, p.year, p.value, l.state_ansi
FROM flour_production p
INNER JOIN state_lookup l
ON p.state_ansi = l.state_ansi
WHERE p.value > 500000000
AND p.period = 'APR'
AND p.year = '2021';
--- Count of such states
SELECT COUNT(DISTINCT l.state_name) AS high_flour_states
FROM flour_production p
JOIN state_lookup l ON p.state_ansi = l.state_ansi
WHERE p.value > 500000000
AND p.period = 'APR'
AND p.year = '2021';- SQL (Intermediate – Advanced)
Used advanced SQL techniques such as
JOIN,LEFT JOIN, subqueries, aggregation (SUM,AVG),DISTINCT, and conditional filtering. - Data Cleaning & Exploration Ensured accuracy by filtering out null values, misjoins, and using appropriate conditions for time-specific and state-specific queries.
- Business Reporting Aligned analysis with department-specific goals (e.g. Flour Department) to produce actionable insights.
- SQL (MySQL syntax)
- Total and average production per year
- States missing data for quality checks
- Production trends across time and commodities
- States with exceptional production metrics
- This project provided valuable insights into commodity production across U.S. states, supported decision-making for inter-departmental planning, and showcased data-driven storytelling using SQL.
Kelvin Ayogu – Data Analyst | Operation Analyst | Business Analyst
Disclaimer: All SQL scripts and datasets used in this project are created solely for portfolio and demonstration purposes. They do not contain or represent any proprietary, confidential, or real company data. Any resemblance to actual business processes or datasets is purely coincidental.
