# Pumpkins: Where do they come from, where do they go?

![Pumpkins at the French Market in New Orleans](https://upload.wikimedia.org/wikipedia/commons/5/5c/FrenchMarketPumpkinsB.jpg)

Infrogmation of New Orleans / CC BY-SA (https://creativecommons.org/licenses/by-sa/2.0)

To do:

- Analyze which states pumpkins primarily come from
- Analyze what the most popular destinations are
- Figure out the distance they travel from origin to destination


Contents:
1. Background
2. Origins and Destinations
3. Economic Impact
4. Environmental Impact

## Background

What is a terminal market? How is this different from a supermarket?

When an individual in the United States goes to purchase food, they usually do so at one of a few types of places: a supermarket, big box store, bodega, neighborhood grocer, or farmers' market. Sometimes people don't cook for themselves and instead visit restaurants, from large fast food chains to local cafes. Generally, none of these places is where the food starts off; instead, they sit at the end of chains of growers, storage facilities, and merchants.

This dataset covers "terminal markets", a type of large wholesale market where commodities are bought and sold. Those in the dataset are agricultural wholesale markets that sell large amounts of fruits, vegetables, legumes, and other edibles.

### Research Questions

When examining this data, I had a few questions that I wanted to answer, some immediately visible, others requiring more detailed research and exploration.

* Where do most of the pumpkins in the US come from?
* Where are they shipped to?
* What is the impact of shipping to these markets on the environment? How much does this transportation cost?
* Are there any conclusions drawn from these questions that can be applied more generally towards agricultural or commodity distribution in the United States as a whole?
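
One possible starting point for the shipping questions is the straight-line (great-circle) distance between an origin and a destination. The sketch below uses the haversine formula. The function name `haversine_km` and the coordinates are illustrative assumptions, not part of the dataset: the data records origins as states and destinations as cities, so a representative point would have to be chosen for each. Road distance will generally be longer than this estimate.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1 to guard against floating-point overshoot before asin
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


# Approximate city-centre coordinates, used only for illustration
boston = (42.36, -71.06)
philadelphia = (39.95, -75.17)

distance = haversine_km(*boston, *philadelphia)
print(f"Boston to Philadelphia: about {distance:.0f} km in a straight line")
```

Multiplying such distances by the number of orders on each route would give a rough measure of total transport effort, under the assumption that every order travels the full straight-line distance.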

### Importing Packages

In [None]:
import numpy as np
import pandas as pd
import os

import matplotlib
import matplotlib.pyplot as plt

### Importing the dataset using Pandas

In [None]:
directory = '../input/a-year-of-pumpkin-prices'
city_names = ['atlanta','baltimore','boston','chicago','columbia','dallas','detroit',
             'los-angeles','miami','new-york','philadelphia','san-fransisco','st-louis']
file_extension = '_9-24-2016_9-30-2017.csv'

city_dfs = {}
city_sizes = {}

for city in city_names:
    file_name = directory + '/' + city + file_extension
    new_df = pd.read_csv(file_name)
    city_sizes[city] = new_df.shape[0]
    city_dfs[city] = new_df
    
# DataFrame containing rows from all cities, useful for looking at the data as a whole.
# Passing a dict to pd.concat uses the city names as the outer level of the index.
full_df = pd.concat(city_dfs)
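
Because `pd.concat` is given a dictionary, the combined frame carries the city name as the outer level of its index rather than as a column. The following is a minimal sketch, on made-up toy frames, of how that level could be named and turned into an ordinary column. The level names `City` and `Row` and the toy values are assumptions chosen for illustration.

```python
import pandas as pd

# Toy stand-ins for the per-city frames (values are made up)
toy_dfs = {
    'boston': pd.DataFrame({'Origin': ['MASSACHUSETTS', 'MICHIGAN']}),
    'chicago': pd.DataFrame({'Origin': ['MICHIGAN']}),
}

# The dict keys become the outer index level; `names` labels both levels
combined = pd.concat(toy_dfs, names=['City', 'Row'])

# Move the city level into a regular column for easier grouping and filtering
flat = combined.reset_index(level='City')

# Number of rows per destination city
rows_per_city = flat.groupby('City').size().to_dict()
print(rows_per_city)
```

With the city available as a column, per-destination counts no longer need to be tracked in a separate dictionary while loading.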

In [None]:
origin_locations = full_df['Origin'].value_counts().to_dict()

def get_colors(data):
    return [plt.cm.terrain(i/float(len(data.keys()))) for i in range(len(data.keys()))]

fig = plt.figure(figsize = (14, 8))
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)


def make_a_pretty_graph(data, ax, title, xlabel, ylabel):
    colors = get_colors(data)
    ax.bar(list(data.keys()), list(data.values()), color=colors)
    # Label each bar with its count, slightly above the top of the bar
    for k, v in data.items():
        ax.text(k,          # x position: the bar's category
                v + 4,      # y position: just above the bar
                str(v),     # label text
                fontsize=10,
                horizontalalignment='center',  # center the label over the bar
                verticalalignment='center')
    ax.tick_params(axis = 'x', labelrotation = 90, labelsize = 12)
    ax.tick_params(axis = 'y', labelsize = 12)
    ax.set_title(title, fontsize=14)
    ax.set_xlabel(xlabel, fontsize=12, labelpad=4)
    ax.set_ylabel(ylabel, fontsize=12, labelpad=2)

make_a_pretty_graph(origin_locations, ax1, "Pumpkin States of Origin by Frequency","Location of Origin","# of Orders")
make_a_pretty_graph(city_sizes, ax2, "Pumpkin Destinations by Frequency","Destination","# of Orders")

These graphs suggest that pumpkins grow well in a wide range of climates. Three of the five most frequent states of origin (Pennsylvania, Michigan, and Massachusetts) are in cooler, more temperate regions, while the other two (Texas and California) are usually warmer and drier.
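
A ranking like the one read off the chart can also be computed directly. The sketch below uses a small made-up `Origin` column in place of the real data; with the notebook's data, the same calls could be applied to `full_df['Origin']`. The counts here are illustrative only and say nothing about the actual results.

```python
import pandas as pd

# Made-up origins standing in for full_df['Origin']
origins = pd.Series(
    ['PENNSYLVANIA'] * 3 + ['MICHIGAN'] * 2 + ['TEXAS'],
    name='Origin',
)

# value_counts sorts by frequency, most common first
counts = origins.value_counts()
top_two = counts.head(2).index.tolist()

# normalize=True returns each origin's share of all orders instead of a raw count
shares = origins.value_counts(normalize=True)

print(top_two)
print(shares.round(3))
```

Shares are often easier to compare than raw counts when the number of recorded orders differs a lot between origins.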

Pumpkins generally need around 3 months to grow and don't survive a frost, so it makes sense that they are rarely grown in areas with long periods of cold.

In [None]:
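# Growing zones by state; the California and Massachusetts entries have not been filled in yet
#growing_zones = { 'pennsylvania':['6a','6b','5a','5b'],'michigan':['6a','6b','5a','5b','4b','4a'],
#                'california':[],'massachusetts':[]}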
