In [1]:
# Wrangling And Writing a New CSV file for USDA Production Data

In [2]:
# Load Packages
library(readr)
library(dplyr)
library(ggplot2)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




In [3]:
# Load Data

# Reading USDA_Production_Supply_And_Distribution File

USDA_Production_Supply_And_Distribution <- read_csv("../Data/USDA_Production_Supply_And_Distribution.csv")

New names:
* `` -> ...1

Rows: 9402563 Columns: 9

── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): Commodity_Description, Country_Code, Country_Name, Attribute_Descri...
dbl (4): ...1, Year, Value, Seen_On


ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.



In [4]:
# Filter the USDA data set to rows that meet all of the following:
# Year >= 1998 & Year <= 2012
# Attribute_Description == 'Production'
# Country_Code == 'US'

Filtered_USDA_DataSet <- filter(
  USDA_Production_Supply_And_Distribution,
  Year >= 1998 & Year <= 2012,
  Attribute_Description == 'Production',
  Country_Code == 'US'
)
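
The same three conditions can be checked on a small table before running them against the full 9.4-million-row file. The sketch below is only an illustration: `toy_usda` and its values are made up, and `between()` is used as an equivalent, inclusive way to write the year range.

```r
library(dplyr)

# Hypothetical toy rows standing in for the USDA table (values are invented)
toy_usda <- tibble(
  Commodity_Description = c("Corn", "Corn", "Corn", "Wheat"),
  Country_Code          = c("US", "US", "CA", "US"),
  Year                  = c(1997, 2005, 2005, 2012),
  Attribute_Description = c("Production", "Production", "Production", "Imports"),
  Value                 = c(10, 20, 30, 40)
)

# Comma-separated conditions in filter() are combined with AND;
# between() includes both endpoints (1998 and 2012)
toy_filtered <- toy_usda %>%
  filter(
    between(Year, 1998, 2012),
    Attribute_Description == "Production",
    Country_Code == "US"
  )

print(toy_filtered)
```

Only the 2005 US production row passes all three conditions; the 1997 row fails the year range, the CA row fails the country code, and the Imports row fails the attribute check.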


In [5]:
# Exporting the FilteredDataSet to a CSV to share with the team.

write.csv(Filtered_USDA_DataSet, "../Data/Test_Filtered_USDA_DataSet.csv")
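
A note on the exported file: `write.csv()` writes row names by default, which adds an unnamed first column to the CSV. When such a file is read back with `read_csv()`, that column shows up under a generated name, similar to the `...1` column reported when the USDA file was loaded. The sketch below uses a hypothetical two-row data frame and temporary files to compare the headers with and without `row.names = FALSE`.

```r
# Hypothetical two-row data frame (values are invented)
toy <- data.frame(Year = c(1998, 1999), Value = c(1.5, 2.5))

with_index    <- tempfile(fileext = ".csv")
without_index <- tempfile(fileext = ".csv")

write.csv(toy, with_index)                        # default: row names are written
write.csv(toy, without_index, row.names = FALSE)  # no index column

# Compare the header line of each file
header_with    <- readLines(with_index, n = 1)
header_without <- readLines(without_index, n = 1)

print(header_with)
print(header_without)
```

If teammates do not need the index, passing `row.names = FALSE` keeps the shared file to the data columns only.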


In [6]:

# Appending HoneyProduction Data to Filtered USDA Data Set


In [7]:
# Loading data

# Reading honeyproduction CSV file

honeyproduction <- read_csv("../Data/honeyproduction.csv")

Rows: 626 Columns: 8

── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): state
dbl (7): numcol, yieldpercol, totalprod, stocks, priceperlb, prodvalue, year


ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.



In [8]:
# Assigning to a new dataset 

honeyproductionDS <- honeyproduction

In [9]:
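# Group the honey production by year, sum totalprod across states, and add
# constant columns so the result matches the USDA production data frame.
# Note: Seen_On is set to the string "1", while the USDA Seen_On column is numeric,
# so the combined Seen_On column becomes character after rbind().

Grouped_Honey_DS <- honeyproductionDS %>%
  group_by(year) %>%
  summarise(Value = sum(totalprod), Country_Name = "United States",
            Commodity_Description = "Honey", Country_Code = "US",
            Attribute_Description = "Production", Unit_Description = "(LBS)", Seen_On = "1",
            .groups = 'drop')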


In [10]:
# Changing year to Year
colnames(Grouped_Honey_DS)[1] <- "Year"
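
The group, sum, and rename steps can also be written as one pipeline. The sketch below is a minimal illustration on a hypothetical `toy_honey` table with invented values; it uses `rename()` to change the column by name instead of by position, which is an alternative to the `colnames()` assignment above rather than the notebook's own approach.

```r
library(dplyr)

# Hypothetical state-level rows shaped like honeyproduction (values are invented)
toy_honey <- tibble(
  state     = c("AL", "AZ", "AL", "AZ"),
  totalprod = c(100, 200, 300, 400),
  year      = c(1998, 1998, 1999, 1999)
)

toy_grouped <- toy_honey %>%
  group_by(year) %>%
  summarise(Value = sum(totalprod), .groups = "drop") %>%  # one row per year
  mutate(Commodity_Description = "Honey", Country_Code = "US") %>%
  rename(Year = year)                                      # rename by name, not position

print(toy_grouped)
```

Renaming by name keeps working if the column order changes later, whereas `colnames(x)[1]` relies on `year` being the first column.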


In [11]:
# Subset the USDA data frame to drop the unnamed index column (`...1`), keeping columns 2 through 9.
Filtered_USDA_DF <- subset(Filtered_USDA_DataSet, select=c(2:9))


In [12]:
# Append the honey dataset (grouped by year) to the USDA dataset

Appended_Data_USDAHoney <- rbind(Filtered_USDA_DF, Grouped_Honey_DS)
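
Two properties of `rbind()` on data frames matter for this append: columns are matched by name, so the two inputs may list them in a different order, and a column that is numeric in one input and character in the other comes out as character. The sketch below shows both on hypothetical one-row data frames (`usda_like` and `honey_like` are illustrative names, not objects from this notebook).

```r
# Hypothetical one-row stand-ins for the two inputs
usda_like  <- data.frame(Year = 2012, Value = 450871, Seen_On = 202110,
                         stringsAsFactors = FALSE)
honey_like <- data.frame(Value = 219519000, Year = 1998, Seen_On = "1",
                         stringsAsFactors = FALSE)

# Columns are matched by name; the result follows the first input's column order
combined <- rbind(usda_like, honey_like)

print(names(combined))
print(class(combined$Seen_On))  # numeric + character input gives a character column
```

This matches the `<chr>` type shown for `Seen_On` in the appended output. If a numeric `Seen_On` is preferred, one option is to supply a number instead of `"1"` when building the honey rows.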


In [13]:
# See if it worked, checking last 20 rows.
View(tail(Appended_Data_USDAHoney, n = 20))


Commodity_Description,Country_Code,Country_Name,Year,Attribute_Description,Unit_Description,Value,Seen_On
<chr>,<chr>,<chr>,<dbl>,<chr>,<chr>,<dbl>,<chr>
"Walnuts, Inshell Basis",US,United States,2008,Production,(MT),395533,202110
"Walnuts, Inshell Basis",US,United States,2009,Production,(MT),396440,202110
"Walnuts, Inshell Basis",US,United States,2010,Production,(MT),457221,202110
"Walnuts, Inshell Basis",US,United States,2011,Production,(MT),418212,202110
"Walnuts, Inshell Basis",US,United States,2012,Production,(MT),450871,202110
Honey,US,United States,1998,Production,(LBS),219519000,1
Honey,US,United States,1999,Production,(LBS),202387000,1
Honey,US,United States,2000,Production,(LBS),219558000,1
Honey,US,United States,2001,Production,(LBS),185748000,1
Honey,US,United States,2002,Production,(LBS),171265000,1


In [14]:
# Exporting the AppendedDataSet to a CSV to share with the team.
write.csv(Appended_Data_USDAHoney, "../Data/Test_Appended_Data_USDAHoney.csv")