<h1>1. Introduction to the research space</h1>

<h3>1.1 Summary of research area</h3>

<p>This research is</p>

<h3>1.2 Aims and objectives</h3>
<p>This research project aims to understand the general trend of food security in Singapore by analysing the supply of livestock. Although livestock supply may not accurately represent the overall food supply in Singapore, it offers a view of local food security. Ideally, this research could help policymakers understand changes in the supply of meat and support them in making food-related policies. The research question is: how does the supply of livestock for food in Singapore change over time?</p>

<h3>1.3 Acquisition of the dataset</h3>
<p>The historical data on Livestock Slaughtered was acquired from the official website of the Singapore Department of Statistics (SingStat) via its web API. The code for retrieving the data is shown below.</p>

In [1]:
#importing the relevant libraries
import requests
import json
import pandas as pd

<p>I have imported the above libraries for the following reasons:</p>
<p>The requests library is used to send an HTTP GET request to the website so that the Python program can retrieve the data.</p>
<p>The json library is used to parse the JSON text returned by the website into Python objects so that it can be used by other parts of the program.</p>
<p>The pandas library is used because it is a common data analysis tool for handling dataframes in Python. It is used to write the data to CSV files and read it back, and to process the data into a suitable format.</p>

In [2]:
#function to return the url for the data
def getRequesturl():
    url = "https://tablebuilder.singstat.gov.sg/api/table/tabledata/M890521"
    return url

In [3]:
#functions to scrape the SingStat website for the data
def getApiData(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    poultry = data["Data"]["row"][0]["columns"]
    chicken = data["Data"]["row"][1]["columns"]
    duck = data["Data"]["row"][2]["columns"]
    quail = data["Data"]["row"][3]["columns"]  # row 3, matching getApiDataQuails
    pigs = data["Data"]["row"][4]["columns"]  # row 4, matching getApiDataPigs

    return poultry, chicken, duck, quail, pigs

def getApiDataPoultry(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    poultry =  data["Data"]["row"][0]["columns"]
    return poultry

def getApiDataChickens(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    chickens =  data["Data"]["row"][1]["columns"]
    return chickens

def getApiDataDucks(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    ducks =  data["Data"]["row"][2]["columns"]
    return ducks

def getApiDataQuails(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    quails =  data["Data"]["row"][3]["columns"]
    return quails

def getApiDataPigs(requestUrl):
    response = requests.get(requestUrl)
    data = json.loads(response.text)
    pigs =  data["Data"]["row"][4]["columns"]
    return pigs
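
<p>The per-animal functions above each download the same table again. As a minimal sketch, assuming the JSON layout used above (<code>Data</code>, then <code>row</code>, then <code>columns</code>), the download and the row lookup could be separated so the table is requested only once. The names <code>fetch_table</code> and <code>extract_columns</code> are hypothetical helpers, not part of the original notebook, and the sample payload is made up to show the shape only.</p>

```python
import json

# URL copied from getRequesturl() above
TABLE_URL = "https://tablebuilder.singstat.gov.sg/api/table/tabledata/M890521"


def fetch_table(url=TABLE_URL, timeout=30):
    """Download the table once and return the parsed JSON (hypothetical helper)."""
    import requests  # imported here so the offline example below needs no network stack

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on HTTP error codes
    return json.loads(response.text)


def extract_columns(data, row_index):
    """Return the list of key/value records for one row of the parsed table."""
    return data["Data"]["row"][row_index]["columns"]


# Offline illustration with a made-up payload in the same shape as the API response
sample = {"Data": {"row": [
    {"columns": [{"key": "1993", "value": "1"}]},
    {"columns": [{"key": "1993", "value": "2"}]},
]}}
print(extract_columns(sample, 1))
```

<p>With the live site, <code>data = fetch_table()</code> followed by <code>extract_columns(data, 1)</code> would be expected to give the same records as <code>getApiDataChickens(requestUrl)</code>.</p>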

In [4]:
#scraping for poultry
requestUrl = getRequesturl()

poultry = getApiDataPoultry(requestUrl)
dfPoul = pd.DataFrame(poultry)
# dfPoul #uncomment to view the dataframe

In [5]:
#scraping for chicken
chicken = getApiDataChickens(requestUrl)
dfChick = pd.DataFrame(chicken)
# dfChick #uncomment to view the dataframe

In [6]:
#scraping for Duck
ducks = getApiDataDucks(requestUrl)
dfDuck = pd.DataFrame(ducks)
# dfDuck #uncomment to view the dataframe

In [7]:
#scraping for Quails
quail = getApiDataQuails(requestUrl)
dfQuail = pd.DataFrame(quail)
# dfQuail #uncomment to view the dataframe

In [8]:
#scraping for pig
pigs = getApiDataPigs(requestUrl)
dfPig = pd.DataFrame(pigs)
# dfPig #uncomment to view the dataframe

<p>The data acquired is saved in CSV files. (Note: in case the website breaks down or prohibits web scraping in the future, a backup of the files is available in the data folder. At the time of writing, both copies of the data are identical.) If the website is unavailable, please run the code that is commented out below to continue.</p>

In [9]:
#save to csv
dfPoul.to_csv("poul.csv")
dfChick.to_csv("chick.csv")
dfDuck.to_csv("duck.csv")
dfQuail.to_csv("quail.csv")
dfPig.to_csv("pig.csv")

########## The code below is for the backup in case the website fails ##########
# #read from saved files
# dfPoul=pd.read_csv("data/poul.csv")
# dfChick=pd.read_csv("data/chick.csv")
# dfDuck=pd.read_csv("data/duck.csv")
# dfQuail=pd.read_csv("data/quail.csv")
# dfPig=pd.read_csv("data/pig.csv")
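
<p>Instead of uncommenting the backup code by hand, the fallback could be automated. The following is a minimal sketch, assuming the backup CSV files have the same columns as the live data; <code>load_with_backup</code> is a hypothetical helper and is not part of the original notebook.</p>

```python
import pandas as pd


def load_with_backup(fetch, backup_path):
    """Try the live source first; fall back to the saved CSV if it fails."""
    try:
        return pd.DataFrame(fetch())
    except Exception as error:  # broad on purpose: any failure triggers the fallback
        print(f"live fetch failed ({error}); reading {backup_path}")
        return pd.read_csv(backup_path)
```

<p>For example, <code>dfPoul = load_with_backup(lambda: getApiDataPoultry(requestUrl), "data/poul.csv")</code> would use the website when it responds and the backup file otherwise.</p>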

<br>

<h3> 1.4 Exploratory data analysis</h3>

<p>In the exploratory data analysis, I find the maximum number of each type of livestock slaughtered and the corresponding year, as well as the first year for which data is available. I also check for possible missing values.</p>

<img src="img/notation.jpg" alt="original data in singstats" width="200" height="200">

<p>The picture above is a screenshot from SingStat showing the notation used when data is missing. With reference to this picture, the possible missing-value markers to look for are "na", "nec", "nes" and "-".</p>
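
<p>The same three checks are repeated for each animal in the cells that follow. As a minimal sketch, assuming every dataframe has the columns <code>key</code> and <code>value</code>, they could be wrapped in one function. <code>summarise</code> and <code>PLACEHOLDERS</code> are hypothetical names, and the demo numbers are made up.</p>

```python
import pandas as pd

# Missing-value markers taken from the SingStat notation described above
PLACEHOLDERS = ["na", "nec", "nes", "-"]


def summarise(df):
    """Return the peak row(s), the first-year row(s) and any rows with missing data."""
    peak = df[df["value"] == df["value"].max()]
    first = df[df["key"] == df["key"].min()]
    missing = df[df.isna().any(axis=1) | df["value"].isin(PLACEHOLDERS)]
    return peak, first, missing


# Made-up numbers purely for illustration
demo = pd.DataFrame({"key": [2000, 2001, 2002], "value": [10, 30, 20]})
peak, first, missing = summarise(demo)
print(peak, first, missing, sep="\n\n")
```

<p>Calling <code>summarise(dfChick)</code>, <code>summarise(dfDuck)</code> and so on would then give the same three tables as the individual cells.</p>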

<h4>Exploratory data analysis for Chicken:</h4>

In [10]:
max_row=dfChick[dfChick['value']==dfChick['value'].max()]#check for max value (named max_row to avoid shadowing the built-in max)
min_year=dfChick[dfChick['key']==dfChick['key'].min()]#check for starting year


none1=dfChick[dfChick.isna().any(axis=1)]#check for NaN values
none2=dfChick.loc[dfChick['value'].isin(['na', 'nec', 'nes', '-'])]#check for SingStat missing-value markers

#print the output
print("max number slaughtered:")
display(max_row)

print("starting year:")
display(min_year)

print("none value(s):")
display(none1,none2)

max number slaughtered:


Unnamed: 0,key,value
25,2018,51400


starting year:


Unnamed: 0,key,value
0,1993,35506


none value(s):


Unnamed: 0,key,value


Unnamed: 0,key,value


<h4>Exploratory data analysis for Duck:</h4>

In [11]:
max_row=dfDuck[dfDuck['value']==dfDuck['value'].max()]#check for max value (named max_row to avoid shadowing the built-in max)
min_year=dfDuck[dfDuck['key']==dfDuck['key'].min()]#check for starting year

none1=dfDuck[dfDuck.isna().any(axis=1)]#check for NaN values
none2=dfDuck.loc[dfDuck['value'].isin(['na', 'nec', 'nes', '-'])]#check for SingStat missing-value markers

#print the output
print("max number slaughtered:")
display(max_row)

print("starting year:")
display(min_year)

print("none value(s):")
display(none1,none2)

max number slaughtered:


Unnamed: 0,key,value
7,2000,7428


starting year:


Unnamed: 0,key,value
0,1993,6318


none value(s):


Unnamed: 0,key,value


Unnamed: 0,key,value


<h4>Exploratory data analysis for Quail:</h4>

In [12]:
max_row=dfQuail[dfQuail['value']==dfQuail['value'].max()]#check for max value (named max_row to avoid shadowing the built-in max)
min_year=dfQuail[dfQuail['key']==dfQuail['key'].min()]#check for starting year


none1=dfQuail[dfQuail.isna().any(axis=1)]#check for NaN values
none2=dfQuail.loc[dfQuail['value'].isin(['na', 'nec', 'nes', '-'])]#check for SingStat missing-value markers

#print the output
print("max number slaughtered:")
display(max_row)

print("starting year:")
display(min_year)

print("none value(s):")
display(none1,none2)

max number slaughtered:


Unnamed: 0,key,value
7,2019,85


starting year:


Unnamed: 0,key,value
0,2012,80


none value(s):


Unnamed: 0,key,value


Unnamed: 0,key,value


<h4>Exploratory data analysis for Pigs</h4>

In [13]:
max_row=dfPig[dfPig['value']==dfPig['value'].max()]#check for max value (named max_row to avoid shadowing the built-in max)
min_year=dfPig[dfPig['key']==dfPig['key'].min()]#check for starting year


none1=dfPig[dfPig.isna().any(axis=1)]#check for NaN values
none2=dfPig.loc[dfPig['value'].isin(['na', 'nec', 'nes', '-'])]#check for SingStat missing-value markers

#print the output
print("max number slaughtered:")
display(max_row)

print("starting year:")
display(min_year)

print("none value(s):")
display(none1,none2)

max number slaughtered:


Unnamed: 0,key,value
5,2020,431


starting year:


Unnamed: 0,key,value
0,2015,334


none value(s):


Unnamed: 0,key,value


Unnamed: 0,key,value


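<p>From the analysis above, the maximum number slaughtered occurs in a recent year for chicken (2018), quail (2019) and pigs (2020), which suggests a generally increasing trend, whereas duck peaked in 2000. There are no NaN values or missing-value markers in any of the data gathered; however, the series start in different years (1993 for chicken and duck, 2012 for quail and 2015 for pigs).</p>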
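
<p>A peak year and a starting year only hint at a trend. One way to put a number on it, sketched here under the assumption that <code>value</code> is numeric and <code>key</code> is the year, is the percentage change between the first and last year. <code>overall_change</code> is a hypothetical helper and the demo values are made up.</p>

```python
import pandas as pd


def overall_change(df):
    """Percentage change in 'value' between the earliest and the latest year."""
    ordered = df.sort_values("key")
    start = float(ordered["value"].iloc[0])
    end = float(ordered["value"].iloc[-1])
    return (end - start) / start * 100


# Made-up numbers purely for illustration
demo = pd.DataFrame({"key": [2015, 2016, 2017], "value": [100, 90, 120]})
print(f"{overall_change(demo):.1f}%")
```

<p>A positive result for <code>overall_change(dfChick)</code> and a negative one for <code>overall_change(dfDuck)</code> would be consistent with the observation above, although this measure ignores the years in between.</p>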

<br>

<h1>2. Justification of the relevance of data to the aims/objective and use of data source</h1>

<h3>2.1 Origin of the data</h3>

<p>The data originates from the official website of the Singapore Department of Statistics (<a href="https://www.singstat.gov.sg/">link to SingStat</a>). It offers a service called the SingStat Table Builder, where statistics on Singapore are displayed. The data chosen was the yearly data on Livestock Slaughtered in Singapore (<a href="https://tablebuilder.singstat.gov.sg/table/TS/M890521">link to data</a>).
    
The web API is used to retrieve the data as a JSON (JavaScript Object Notation) file. The .json file can be found <a href="https://tablebuilder.singstat.gov.sg/api/table/tabledata/M890521">here</a>. The retrieval method is similar to that of week 10's lecture. 
</p>

<h3>2.2 Appropriateness of the data</h3>

<p>This data is appropriate as it offers an overview of the food supply in Singapore in terms of meat. The data also covers the major sources of meat consumed in Singapore. The number of Livestock Slaughtered gives an insight into the meat supply to the local market, and it influences that supply greatly because local slaughter is one of the main sources of fresh meat available locally. 
    
In addition, the data is published by the Singapore Department of Statistics, citing the source as "AGRI-FOOD AND VETERINARY AUTHORITY, SINGAPORE FOOD AGENCY". All parties involved are part of the Singapore government, making the source credible.</p>

<h3>2.3 Case for working with this data</h3>

<img src="img/original.jpg" alt="original data in singstats" width="500" height="600">

<p>The image above shows the original data on the SingStat website. With reference to the research question stated in section 1.2, the rows relevant to the analysis are Poultry, Chickens, Ducks, Quails and Pigs. The data in these rows shows the net number, in thousands, of the respective livestock slaughtered, which captures the changes in the supply of livestock. The columns from 1993 to 2021 show the year in which the data was collected, which captures how the supply changes over time.</p>

<p>As for the current dataframe used, I will use that of the poultry (dfPoul) as an example. The other dataframes are of similar structure.</p>

In [14]:
display(dfPoul)

Unnamed: 0,key,value
0,1993,41824
1,1994,43012
2,1995,37429
3,1996,42505
4,1997,45514
5,1998,47391
6,1999,52215
7,2000,50155
8,2001,50213
9,2002,51721


<p>The column heading "key" in the dataframe shown above refers to the year in which the data was gathered, while the column heading "value" refers to the number of this particular livestock slaughtered. The headings are identical for all the other dataframes.</p>
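
<p>Values parsed from JSON can arrive as text rather than numbers. As a minimal sketch, assuming that is the case for <code>key</code> and <code>value</code> here (this has not been verified against the live API), both columns could be converted explicitly before any comparison. <code>to_numeric_frame</code> is a hypothetical helper.</p>

```python
import pandas as pd


def to_numeric_frame(df):
    """Return a copy with 'key' as an integer year and 'value' as a number."""
    out = df.copy()
    out["key"] = out["key"].astype(int)
    # errors="coerce" turns markers such as "na" or "-" into NaN
    out["value"] = pd.to_numeric(out["value"], errors="coerce")
    return out


# Made-up frame with string columns, including one missing-value marker
demo = pd.DataFrame({"key": ["1993", "1994"], "value": ["41824", "na"]})
print(to_numeric_frame(demo).dtypes)
```

<p>Converting first keeps <code>.max()</code> and <code>.min()</code> numeric; on text columns they would compare character by character instead.</p>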

<br>

<h3>2.4 Formatting the data</h3>

<p>The current dataframes (dfPoul, dfChick, dfDuck, dfQuail, dfPig) will be combined into one dataframe for ease of analysis. The dataframe dfPoul will be discarded as it is not representative of a particular type of livestock.</p>

<p> I will rename the column of each dataframe to a unique name and then combine them into one dataframe. </p>

In [15]:
#dataframe changed and reindexed to regroup
#renaming df for chicken
dfChick = dfChick.rename(columns={'key': 'year', 'value': 'chi_num'})
dfChick = dfChick.set_index('year')

#renaming df for duck
dfDuck = dfDuck.rename(columns={'key': 'year', 'value': 'duck_num'})
dfDuck = dfDuck.set_index('year')

#renaming df for quail
dfQuail = dfQuail.rename(columns={'key': 'year', 'value': 'qui_num'})
dfQuail = dfQuail.set_index('year')

#renaming df for pig
dfPig = dfPig.rename(columns={'key': 'year', 'value': 'pig_num'})
dfPig = dfPig.set_index('year')

In [16]:
#create new dataframe for everything
frames = [dfChick, dfDuck, dfQuail, dfPig]
df_livestock = pd.concat(frames,axis=1).reindex(dfChick.index)

#store data as csv (the year is the index, so the index must be written to the file)
df_livestock.to_csv("livestock.csv")
display(df_livestock)

year,chi_num,duck_num,qui_num,pig_num
1993,35506,6318,,
1994,35956,7056,,
1995,31264,6166,,
1996,36312,6193,,
1997,38631,6884,,
1998,41124,6268,,
1999,44858,7357,,
2000,42727,7428,,
2001,43484,6729,,
2002,44768,6953,,


In [17]:
# download=pd.read_csv("data/M890521-table.csv")
# download

<p>The data is now in one pandas dataframe, which is suitable for analysis.</p>
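
<p>Because the series start in different years, combining them side by side leaves gaps for the years before a series begins. The following small sketch, using made-up values and the same column names as above, shows how <code>pd.concat</code> with <code>axis=1</code> aligns the frames on the year index and fills those gaps with NaN.</p>

```python
import pandas as pd

# Made-up values: the quail series starts later than the chicken series
chicken = pd.DataFrame({"year": [1993, 1994, 1995], "chi_num": [10, 11, 12]}).set_index("year")
quail = pd.DataFrame({"year": [1995], "qui_num": [3]}).set_index("year")

# axis=1 places the frames side by side, matching rows by the year index
combined = pd.concat([chicken, quail], axis=1)
print(combined)
```

<p>The empty cells for quail and pigs in the combined table above arise the same way, so they mean "no data for that year" rather than zero animals slaughtered.</p>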

<h3>2.5 Comparison with other available data</h3>

<p>During the ideation phase of this project, I also considered a range of other available data sets, in particular the Singapore Food Statistics from the Singapore Food Agency (<a href="https://www.sfa.gov.sg/files/SingaporeFoodStatistics/SFA_SingaporeFoodStatistics1/mobile/index.html">link to data</a>) and Local Production And Local Landings from the Singapore Department of Statistics (<a href="https://tablebuilder.singstat.gov.sg/table/TS/M890721">link to data</a>).</p>

<p>The Singapore Food Statistics provides more in-depth, analysed data on food supply in Singapore. It contains more indicators to analyse and thus gives a bigger picture of the issue. However, it is published as a complete report each year, and extracting the data is difficult because the arrangement of the report differs from year to year.</p>

<p>The Local Production And Local Landings data comes from the same authority as the data used in my analysis, so it is equally detailed and credible. However, Singapore is a tiny city-state with little emphasis on local agriculture and farm production. Local food production is insignificant compared with locally slaughtered livestock, which is mostly imported.</p>

<br>

<h1>3. Background Analysis</h1>

<p>Singapore has always been a resource-poor country due to its limited size. It is heavily reliant on imports, even for daily necessities like food and water. The current supply chain disruption has worsened the matter, as one of its main suppliers of chicken, Malaysia, has decided to stop exporting the bird due to a shortage in supply. Singapore has responded by importing chicken from other sources; however, the local prices of chicken and eggs have still risen significantly.</p>
<br>
<p>Personally, I have felt the rise in food and beverage prices in eateries and school canteens. Thus I have decided to conduct a study on the supply of food in Singapore, in particular the supply of meat.</p>

<h1>4. Exploration of data</h1>

<h3>4.1 Removing illegal values</h3>

<h1>5. Ethics of use of data</h1>