# Vietnamese food landscape in California

**Hai Le**

Github repo: https://github.com/HaiVuLe/Caliphonia-dataviz

## Introduction

"Oh, I'm a big fan of Vietnamese food. Do you know of some good Vietnamese restaurants here?" is a question I often get asked when I introduce myself as having moved to San Francisco from Vietnam, my home country, to attend a master's program in data science. Well, for now, I know more about data science than about Vietnamese food in San Francisco. But wait: though I don't have much personal experience to draw from, I can still be of some help by using my data skills!

Also, Vietnamese food is something I am very proud of about my country. I would love to see for myself how it is received, and hopefully loved, by people from all around the world. Understanding the food landscape in California, on the other hand, would give me some insight into the life of the Vietnamese population in the state where the Vietnamese community has the biggest presence in the entire country. This is a topic I personally care about.

That's why I used the Yelp APIs to put together my own dataset of Vietnamese restaurants in California. The data are the search results returned by Yelp when querying "Vietnamese Restaurant in `{city}`, CA", with `city` being each city in California in turn.
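The collection code is not reproduced in this write-up, so the following is only a minimal sketch of how one such query could be issued against the Yelp Fusion business search endpoint. The helper names `build_search_url` and `fetch_page`, the page size of 50, and the idea of paging with `offset` are assumptions for illustration, not necessarily the code used for this project.

```python
import json
import urllib.parse
import urllib.request

# Yelp Fusion business search endpoint.
SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_url(city, offset=0, limit=50):
    """Build the search URL for one page of results in one city (hypothetical helper)."""
    params = {
        "term": "Vietnamese Restaurant",
        "location": f"{city}, CA",
        "limit": limit,    # page size; 50 is assumed to be the maximum per request
        "offset": offset,  # index of the first result on this page
    }
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_page(city, api_key, offset=0):
    """Request one page of results and return the list of businesses (needs a real key)."""
    request = urllib.request.Request(
        build_search_url(city, offset),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("businesses", [])


# Building the URL needs neither network access nor an API key.
url = build_search_url("San Jose")
print(url)
```

Looping `fetch_page` over every Californian city, and over increasing `offset` values until a page comes back empty, would then yield one record per restaurant.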

Besides, to enrich and contextualize my analysis, I used data on the [population of Californian cities from the US Census Bureau (2018)](https://www.california-demographics.com/cities_by_population), data on cities in the United States [(ESRI)](https://www.esri.com/en-us/home), and data on the Vietnamese population by county from the report ["The Vietnamese Population in the United States: 2010"](http://www.vasummit2011.org/docs/research/The%20Vietnamese%20Population%202010_July%202.2011.pdf) published at VA Summit (2011).

A few questions I would like to ask of this dataset: (1) where are Vietnamese restaurants located, and are they concentrated in areas with a large Vietnamese population or spread evenly across the state; (2) who are the consumers of Vietnamese food, and is it loved by non-Vietnamese foodies too; and (3) what food is served and popular, and what is the quality of Vietnamese food here?

## Summary of Data

### Availability

The very first thing I seek to understand is where Vietnamese food can be found. So I generated a choropleth map at the county level, which gives an overview of the availability of Vietnamese food.

The first interesting finding is that only 36 of California's 58 counties have Vietnamese restaurants. Most counties where Vietnamese cuisine is not a dining option are in the east and the north of the state. This finding is quite surprising to me, given that Washington ranks third among the top 10 states with the largest Vietnamese population (VA Summit, 2011).
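Finding the counties without any restaurant boils down to a set difference between the full list of counties and the counties that appear in the restaurant data. Here is a small sketch with made-up inputs; the variable names and the four-county list are stand-ins, not my actual data.

```python
# Toy stand-ins: the real inputs would be all 58 California counties and the
# county of each restaurant in the dataset.
all_counties = {"Orange", "Santa Clara", "Inyo", "Alpine"}
restaurant_counties = ["Orange", "Orange", "Santa Clara"]

# Counties that appear at least once in the restaurant data.
served = set(restaurant_counties) & all_counties
# Counties that never appear, sorted for a stable listing.
unserved = sorted(all_counties - served)

print(len(served), "of", len(all_counties), "counties have a restaurant")
print("No restaurants:", unserved)
```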

<center><img src="images/Chloropleth-what-counties.png" width="85%"/></center>

To double-check the correctness of my dataset, I went to the Yelp website, searched for Vietnamese restaurants in Inyo County, and indeed found no results. When I looked into population facts about Inyo County, it turned out that there are only 13 Vietnamese people in the entire county according to the US Census Bureau (2010)!

<center><img src="images/no-restaurants.png" width="75%"/></center>

The second interesting fact is that Orange County is the absolute winner by a wide margin. While this may not be so surprising given that the Vietnamese community is highly concentrated in this county, considering its geographical area, the count shows how surprisingly densely Vietnamese restaurants are located here. Now I'd love to travel there to see for myself if Vietnamese food is right around the corner, as my data shows.

While the choropleth map clearly highlights how densely concentrated Vietnamese eateries are in Orange County, it does not show San Francisco - the city I have called home for the past 11 months - in a commensurate light, given how tiny San Francisco is area-wise. In fact, despite its small size, we have great access to Vietnamese food, with 154 restaurants in total. Comparing that number with the equivalent in Monterey, a nearby county, shows how readily available the famous pho and banh mi are here. Yay!

<center><img src="images/Choropleth-SF-in-splotlight.png" width="85%"/></center>

What about at the city level? Will a city-level map tell us another story? Is it possible that some cities have good availability of Vietnamese food but are located in counties where the neighboring cities do not enjoy Vietnamese food?

This chart shows the top 10 cities with the largest number of Vietnamese restaurants. It is first worth noting that these 10 cities host $1,099$ of the $2,322$ restaurants, or about $47\%$ of all Vietnamese restaurants in California!
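The share held by the largest cities can be computed by counting restaurants per city and dividing the sum of the top counts by the total. The sketch below uses toy data and assumes a `city` column with one row per restaurant; the function name `top_n_share` is made up for illustration.

```python
import pandas as pd

# Toy data: one row per restaurant; the `city` column name is an assumption.
restaurants = pd.DataFrame(
    {"city": ["San Jose"] * 4 + ["Westminster"] * 3 + ["Fresno"] * 2 + ["Davis"]}
)


def top_n_share(frame, n):
    """Return the fraction of all restaurants located in the n largest cities."""
    counts = frame["city"].value_counts()  # sorted from largest to smallest
    return counts.head(n).sum() / counts.sum()


print(f"Top 2 cities in the toy data: {top_n_share(restaurants, 2):.0%}")

# The same ratio using the two totals reported in the text.
print(f"Top 10 cities as reported: {1099 / 2322:.1%}")
```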

<center><img src="images/Top 10 cities.png" width="85%"/></center>

All 10 of these cities belong to the top counties. It turns out that Orange County is home to 4 of the top 10 cities!

<center><img src="images/BubbleMap-where-are-top-10-cities.png" width="75%"/></center>

Looking at the number of restaurants per thousand people reveals another interesting fact. Spring Valley and Mountain View, despite having a modest number of Vietnamese restaurants, offer their citizens good access to yummy, healthy Vietnamese food.

And kudos to Westminster. Not only is it among the top 3 cities by number of restaurants, it also makes the top 3 in terms of making Vietnamese cuisine readily reachable to its people.
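The per-capita figure is simply the number of restaurants in a city divided by its population, scaled to 1,000 people. A minimal sketch of that calculation follows, with toy numbers and assumed column names (`city`, `population`); it is not the exact code behind the treemap.

```python
import pandas as pd

# Toy numbers for illustration only; they are not taken from the dataset.
restaurants = pd.DataFrame({"city": ["A", "A", "A", "B", "C", "C"]})
population = pd.DataFrame(
    {"city": ["A", "B", "C"], "population": [300_000, 20_000, 80_000]}
)

# Count restaurants per city, then attach each city's population.
counts = restaurants.groupby("city").size().rename("restaurants").reset_index()
per_capita = counts.merge(population, on="city", how="inner")

# Restaurants per thousand residents, highest first.
per_capita["per_1000"] = per_capita["restaurants"] / per_capita["population"] * 1000
per_capita = per_capita.sort_values("per_1000", ascending=False).reset_index(drop=True)
print(per_capita)
```

In this toy example city A has the most restaurants but ranks last per capita, which is the same kind of reversal that puts smaller cities near the top of the ranking.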

<center><img src="images/Treemap-Resto-per-Capita-2.png" width="85%"/></center>

### Quality

Now, let's take a look at how good Vietnamese food is, according to Yelp users. I hope to see more restaurants receiving a 5-star rating - currently only a little over $2\%$ of restaurants completely win their customers' hearts (in Vietnamese, we have a saying that the quickest way to the heart is through the stomach :) ). Overall, I would say that Vietnamese restaurants are satisfactory though not exceptional.

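```python
import pandas as pd
import cufflinks as cf  # adds the .iplot() method to pandas objects

cf.go_offline()  # assumed setup: render Plotly charts locally in the notebook
df = pd.read_csv('data/vietnamese_restaurant_CA_clean.csv')  # one row per restaurant

# Histogram of Yelp ratings, normalized to percentages of all restaurants
df['rating'].iplot(
    kind='histogram', histnorm='percent', opacity=0.8,
    xTitle='Rating', yTitle='Percentage of total number of restaurants',
    title='Rating of Vietnamese restaurants in California by Yelp users',
    filename='images/Histogram-review-count')
```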

There is a fair amount of inconsistency in the quality of restaurants, more so in some cities than in others. Below is the box plot showing the rating of restaurants in the top 10 cities with the largest number of Vietnamese restaurants. 

Big fans of Vietnamese food should definitely check out San Jose, Santa Ana, Sacramento, Los Angeles, or Garden Grove. As the data shows, finding amazing Vietnamese restaurants there is not that hard. At the same time, make sure to check Yelp carefully before you go, because the quality of food also varies widely in these cities.

Speaking of San Francisco, I'm happy to see that the overall quality of Vietnamese food is among the highest. However, the chance to find a really outstanding restaurant is slim. 
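What the box plot shows visually can also be summarized numerically: the median captures the typical rating in a city, and the interquartile range (IQR) captures how inconsistent the ratings are. The following is a sketch on toy ratings, assuming `city` and `rating` columns; `rating_spread` is a made-up helper name.

```python
import pandas as pd

# Toy ratings; the `city` and `rating` column names are assumptions.
toy = pd.DataFrame(
    {
        "city": ["A"] * 4 + ["B"] * 4,
        "rating": [4.0, 4.0, 4.0, 4.0, 2.5, 3.5, 4.5, 5.0],
    }
)


def rating_spread(frame):
    """Median and interquartile range (IQR) of ratings for each city."""
    grouped = frame.groupby("city")["rating"]
    summary = pd.DataFrame(
        {
            "median": grouped.median(),
            "iqr": grouped.quantile(0.75) - grouped.quantile(0.25),
        }
    )
    # Cities with the most inconsistent ratings come first.
    return summary.sort_values("iqr", ascending=False)


spread = rating_spread(toy)
print(spread)
```

Both toy cities share the same median, yet city B has a much wider IQR: it has 5-star places, but also a greater risk of a disappointing meal.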


<center><img src="images/Boxplot - rating by cities.png" width="75%"/></center>

### Interaction with consumers

Finally, I want to know how much feedback the restaurants receive to improve their quality, and hopefully, to move up towards the 4.5 and 5 star rating camps.

Unfortunately, a large number of restaurants do not receive much feedback. Up to 20% of restaurants have heard back from fewer than 50 customers, and more than 50% of restaurants have fewer than 200 reviews. They should definitely ask to hear more opinions from their customers.

There is one exceptional restaurant that has received more than 5,000 reviews! Maybe it could be a good role model for the 50% of restaurants with fewer than 200 reviews.
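Shares like these come from comparing each restaurant's review count with a threshold and averaging the resulting True/False values. Below is a sketch on toy counts; in the real dataset the values would come from the `review_count` column, and `share_below` is a hypothetical helper.

```python
import pandas as pd

# Toy review counts, one value per restaurant.
review_count = pd.Series([10, 40, 120, 180, 250, 900, 5200], name="review_count")


def share_below(counts, threshold):
    """Fraction of restaurants with strictly fewer than `threshold` reviews."""
    return (counts < threshold).mean()


print(f"Fewer than 50 reviews: {share_below(review_count, 50):.0%}")
print(f"Fewer than 200 reviews: {share_below(review_count, 200):.0%}")

# Row label of the most reviewed restaurant, useful for looking up its name.
most_reviewed = review_count.idxmax()
print("Most reviewed restaurant is at row", most_reviewed)
```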

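```python
import pandas as pd
import cufflinks as cf  # adds the .iplot() method to pandas objects

cf.go_offline()  # assumed setup: render Plotly charts locally in the notebook
df = pd.read_csv('data/vietnamese_restaurant_CA_clean.csv')

# Histogram of review counts; histnorm='percent' makes the y-axis a percentage
df['review_count'].iplot(
    kind='histogram', histnorm='percent', bins=30,
    xTitle='Number of reviews', yTitle='Percentage of restaurants',
    title='How much care restaurants receive from their Yelp customers')
```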

Interestingly enough, the restaurant with more than 5,000 reviews is The Slanted Door in San Francisco.

<center><img src="images/Boxplot - reviews by cities.png" width="75%"/></center>

## Conclusion

Vietnamese food is common, yet it lacks presence in a large part of California. Restaurants are highly concentrated where the Vietnamese population is the biggest. Interestingly, in terms of restaurants per capita, Spring Valley and Mountain View - despite being home to only a handful of restaurants - are actually where citizens have the most access to Vietnamese food on average.

Unfortunately, the quality of Vietnamese food is not yet exceptional, and many of the restaurants do not receive much feedback from their customers to help them improve. Foodies in San Francisco and San Diego tend to be highly vocal about their experience, which possibly helps restaurants there meet their needs better, hence the overall high quality across restaurants in these two cities.

## Citations

United States Census Bureau. (2011, June 1). The Vietnamese Population in the United States: 2010. Retrieved from http://www.vasummit2011.org/docs/research/The%20Vietnamese%20Population%202010_July%202.2011.pdf