# Correlation Between Women's Rights Indexes and Countries' Cuisine Rankings

In this project, I’m going to explore the correlation between a country's women's rights index and its ranking in world cuisine, according to TasteAtlas data.

The inspiration for this project comes from a stand-up show by comedian Andrew Schulz, where he humorously suggests that the culinary qualities of countries are inversely proportional to their women's rights index. You can watch the related stand-up show here: [Andrew Schulz Stand-up](https://www.youtube.com/watch?v=bHnfbGyoa6o). While the topic may be perceived as sensitive, my goal is to use it as a learning exercise in data analysis.

**_Goal and Hypothesis_**

The goal of this project is to see if there's any truth to the hypothesis that countries with higher women's rights indexes have worse food rankings. This hypothesis, though controversial, provides an interesting angle for data analysis.

**_Analysis Plan_**

Before testing the correlation between women's rights indexes and countries' cuisine rankings, I want to examine the women's rights index on its own and make sure I read the data correctly. The correlation comes last. The analysis is structured as follows:
- Assessment of the status and progress of women's rights by country.
- Checking the cuisine rankings of these countries.
- Seeing if there's any correlation between women's rights and countries' food rankings.

**_A Quick Note_**

As I mentioned earlier, this project may be sensitive to some readers. My intention is not to make any insinuations or humiliate anyone. We live in a world where every event and phenomenon has positive and negative consequences, and I wanted to explore a side of this negative situation that can be positive. Although I am absolutely against violence against women and sexual discrimination, I believe that such walls have also been demolished today. However, this was not the case until approximately 20 years ago, and therefore this negative situation may have partially caused a positive result, as it is also mentioned in the stand-up show. Let's look at this together. Thanks for understanding!

It is not easy to find data on the ratings of countries' cuisines and people's preferences through surveys. In addition, the data I have found also includes the preferences of a very specific geography, since they are collected in certain regions. For this reason, I consider it right to use the TasteAtlas ranking system. But first, I need to explain how TasteAtlas decides those rankings.

Regarding this question, TasteAtlas made an announcement about how they rank countries by their cuisine. In summary, they said, "The methodology of TasteAtlas is seemingly simple. Visitors vote, and we tally the votes and publish the rankings. Our key mechanism is a system we developed that differentiates genuine from invalid votes." You can find the full explanation and the reason for the announcement at this link: [TasteAtlas Methodology](https://www.tasteatlas.com/the-largest-french-television-accused-tasteatlas-of-rigging-the-cheese-ranking-in-favor-of-italy#:~:text=The%20methodology%20of%20TasteAtlas%20is,differentiates%20genuine%20from%20invalid%20votes.)

For the reasons given in that description, TasteAtlas ratings offer a more global perspective. Rather than relying on preference surveys conducted only in the United States or Europe, I think it is more accurate to use TasteAtlas, which draws its votes from a broader audience.

![23-24 TasteAtlas](Plots/23-24%20TasteAtlas.jpg)

Let's keep the TasteAtlas data above. Now, let's look at women's rights in general and understand how indexes are calculated.

We have two different datasets for the women's rights index by country. One of them is from the Georgetown Institute for Women, Peace and Security (GIWPS), which is more detailed. The other one is the Women's Civil Rights Index from Our World in Data (Source: V-DEM). You can find more information at these links: [GIWPS](https://giwps.georgetown.edu/the-index/) and [Our World In Data](https://ourworldindata.org/grapher/women-civil-liberties-index?tab=table).

In this research, I'm going to use the GIWPS index as my main data source because it's more detailed and more up to date. The Our World in Data (V-DEM) series provides a single averaged estimate per country, whereas GIWPS also publishes a summary of its data along with many supporting documents and plots. You can check that summary [here](https://giwps.georgetown.edu/wp-content/uploads/2023/10/WPS-Index-executive-summary.pdf) if you're interested. I might still use the V-DEM data if I need to check countries' women's rights indexes by year. We'll see.

# 1. WPS (Women, Peace and Security Index) Overview

**_FIGURE 1_**: WPS Index Indicators

![WPS Index Indicators](Plots/wps.png)

The 13 indicators used to calculate the WPS Index are shown in _Figure 1_, organized into 3 groups. In this research, I will use the overall WPS Index together with a subset of indicators that I consider relevant, and I will refer to those indicators where necessary. But first, let's look at the countries with the highest and lowest index scores.

**_FIGURE 2_**: The dozen best and worst performers on the WPS Index

![country_ranking](Plots/country_ranking.png)

As shown in _Figure 2_, the gap between the top 15 countries is small (0.04), while the gap between the bottom 15 countries is much larger (0.17). Given this uneven spread, a correlation computed directly between countries' index rankings and cuisine rankings is unlikely to yield clear results, because tiny score differences (e.g., 0.001 or less) would still change a country's rank. To avoid that kind of misinterpretation, I will group countries into categories by their indexes and base the analysis on these categories rather than on individual countries. But how are we going to categorize the countries?

To decide how to categorize countries, let's check how GIWPS categorized those countries by regions.

**_FIGURE 3_**: WPS by Regions

![WPS_by_Region](Plots/WPS_by_Region.png)

So after grouping those countries by their indexes by region, what do those regions mean? Let's check that first.

**Country groups and regions**:
- Developed countries (DC)
- Central and East Europe and Central Asia (CEE&CA)
- East Asia and the Pacific (EAP)
- Latin America and the Caribbean (LAC)
- Middle East and North Africa (MENA)
- South Asia (SA)
- Sub-Saharan Africa (SSA)


**_FIGURE 4_**: Region's min and max countries by WPS

![min_max_countries](Plots/min_max_countries.png)


_Figure 4_ lists each region's minimum and maximum WPS index by country. According to the plots, DC countries have a higher average WPS than the other regions, including CEE&CA. However, comparing WPS indexes with TasteAtlas data through these regional groupings would be misleading. For example, the lowest-scoring country in the DC region is Israel with 0.703, while the highest-scoring country in the CEE&CA region is Estonia with 0.892. Similarly, the top countries of the remaining regions are Taiwan (0.818, EAP), Barbados (0.779, LAC), the United Arab Emirates (0.868, MENA), Sri Lanka (0.743, SA), and Seychelles (0.799, SSA).

My point is that while the DC region seems to top this table, Israel has a worse index compared to the maximum countries in other regions, despite being in the DC region. Therefore, I believe that comparing countries by these regions and checking the correlation with TasteAtlas data seems incorrect. Instead, let's first check all countries as a bar plot to see where the index breaking points are. After that, we can split countries by only looking at their indexes and then examine the correlation between countries' cuisines.
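The comparison above can be checked in a few lines. This is a minimal sketch that uses only the scores quoted in the text; the variable names are illustrative and not taken from the project code.

```python
# Scores quoted in the text above (GIWPS WPS Index).
dc_minimum = ("Israel", 0.703)  # lowest-scoring country in the DC region

# Highest-scoring country in each of the other regions.
regional_maximum = {
    "CEE&CA": ("Estonia", 0.892),
    "EAP": ("Taiwan", 0.818),
    "LAC": ("Barbados", 0.779),
    "MENA": ("United Arab Emirates", 0.868),
    "SA": ("Sri Lanka", 0.743),
    "SSA": ("Seychelles", 0.799),
}

# Regions whose best country outscores the weakest country of the DC region.
regions_above_dc_min = [
    region
    for region, (_, score) in regional_maximum.items()
    if score > dc_minimum[1]
]
print(regions_above_dc_min)
```

Every one of the six regions ends up in the list, which is the overlap that makes region-level comparison unreliable.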

**_FIGURE 5_**: Distribution of all countries by WPS

In this distribution plot, the countries are categorized based on their WPS (Women, Peace and Security) indexes. The colors represent different WPS index levels, where higher indexes are indicated by darker shades. The plot shows that there are no clear breaking points between the categories, leading to the decision to use an alternative categorization method, such as the Jenks natural breaks algorithm.

![WPS_all_countries](Plots/WPS_all_countries.png)


As you can see from this plot, there are no clear breaking points. That leaves me two options: split all countries into quantiles, or use the Jenks natural breaks algorithm from the Python library `mapclassify`.

To split these countries, I believe the most ideal approach is to divide them into 7 groups, which corresponds to the number of regions. This way, we won't be splitting them by their geographic locations, but by their WPS indexes into 7 categories. Let's check that out!

**_FIGURE 6_**: Countries distribution after categorizing them by the Jenks algorithm

This plot categorizes countries using the Jenks natural breaks algorithm, which aims to minimize variance within categories while maximizing variance between them. The colors represent the categorized WPS indexes, with darker shades indicating higher index levels. This categorization approach better highlights the differences between countries based on their WPS scores.

![wps_categories_jenks](Plots/wps_categories_jenks.png)
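To make the idea behind Jenks natural breaks concrete, here is a minimal pure-Python sketch. It is not the `mapclassify` implementation mentioned above; it is a small dynamic-programming version of the same objective (minimizing the squared deviation inside each class). The function names `jenks_breaks` and `classify` and the sample scores are made up for illustration and are not real WPS values.

```python
from bisect import bisect_left
from itertools import accumulate


def jenks_breaks(values, k):
    """Return the upper bound of each of k classes that minimizes the
    total within-class sum of squared deviations (1-D natural breaks)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums let us compute the squared deviation of xs[i:j] in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting xs[:j] into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i

    # Walk back through the stored cut points to recover class upper bounds.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = cut[c][j]
    return bounds[::-1]


def classify(value, bounds):
    """Index of the class (0 = lowest) that a value falls into."""
    return min(bisect_left(bounds, value), len(bounds) - 1)


# Made-up scores with three obvious clusters, for illustration only.
scores = [0.30, 0.31, 0.32, 0.60, 0.61, 0.90, 0.91, 0.92]
bounds = jenks_breaks(scores, 3)
labels = [classify(s, bounds) for s in scores]
print(bounds, labels)
```

For the real data, the same call with `k=7` would give the seven class bounds; with a library such as `mapclassify`, the equivalent step is a single classifier call, so this sketch is mainly useful for seeing what the algorithm optimizes.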


Using these categories, let's check the countries mentioned before: Israel, the lowest-scoring country in the DC region, and the highest-scoring country in each of the other regions.
- Israel: Medium
- Estonia: Top
- Taiwan: High
- Barbados: Medium-High
- United Arab Emirates: Top
- Sri Lanka: Medium-High
- Seychelles: High

So although Israel belongs to the DC region, which has the highest average WPS index of all regions, it falls in the Medium category, below the top country of every other region. Regional membership is therefore a poor proxy for a country's own index. With that settled, it is time to look at the TasteAtlas data and the correlation between these categories and countries' cuisine ratings.

# 2. Taste Atlas Data Overview
![23-24 TasteAtlas](Plots/23-24%20TasteAtlas.jpg)

I already explained at the beginning how this ranking is produced. In the ratings, some countries share the same score (like Italy and Japan), and many others are separated by very small gaps. For that reason, I believe the best approach is to split these countries by rank into 7 categories, matching the 7 categories used for WPS. Then, before checking the correlation between WPS categories and food ranking categories, we will look at the countries' ratings on a world map to see their geographic distribution. Let's continue!
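Splitting by rank can be sketched as follows. This is one possible way to do it, assuming equal-sized groups and alphabetical tie-breaking; the function name `rank_categories` and the ratings are hypothetical and are not TasteAtlas values.

```python
def rank_categories(ratings, k):
    """Assign each country a category 0..k-1 (0 = best rated) by rank.

    ratings: dict mapping country name -> rating.
    Countries with equal ratings are ordered alphabetically, so a tie
    can still be split across two neighbouring categories.
    """
    ordered = sorted(ratings, key=lambda name: (-ratings[name], name))
    n = len(ordered)
    return {name: position * k // n for position, name in enumerate(ordered)}


# Made-up ratings for illustration; "A" and "B" are tied on purpose.
example_ratings = {"A": 4.6, "B": 4.6, "C": 4.5, "D": 4.4, "E": 4.3, "F": 4.2}
example_categories = rank_categories(example_ratings, 3)
print(example_categories)
```

Because the split is by position rather than by score, very small rating gaps no longer matter, which is the reason for categorizing in the first place.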

So now, let's see a heatmap of countries by their ratings.

**_FIGURE 7_**: Heatmap of countries by cuisine ratings


![food_rank_heatmap](Plots/food_rank_heatmap.png)


**_FIGURE 8_**: Countries distribution after categorizing them by cuisine ratings

![cuisine_categories_barplot](Plots/cuisine_categories_barplot.png)


On the heatmap, the redder a country is, the higher its rating. Africa has few countries on this list, whereas almost all of the Americas is covered, and judging by the colors, with high ratings. For example, Mexico is 7th, Peru is 10th, Brazil is 12th, Argentina is 14th, and the USA is 16th. Without examining each country's cuisine in detail, one plausible reading is that European settlers, who largely shaped these countries, brought their food culture with them. The USA and Canada, for example, do not have a culinary history as long as that of Greece or France.

So the question is, if Europeans took their food culture there, how did food culture spread in Europe, Asia, or Africa? Is there any connection between those countries which are in the top 100 cuisines, or is it all about coincidence?

**_FIGURE 9_**: The Silk Road

This map illustrates the historical Silk Road, a major trade network that connected the East and West. The red path highlights the route along which goods, cultures, and culinary traditions were exchanged. This context is important when analyzing the correlation between regions along the Silk Road and their cuisine quality ratings, as the trade route played a significant role in spreading culinary practices across these regions.

![Silk Road](Plots/Silk%20Road.png)

Silk Road, ancient trade route, linking China with the West, that carried goods and ideas between the two great civilizations of Rome and China. Silk went westward, and wools, gold, and silver went east. China also received Nestorian Christianity and Buddhism (from India) via the Silk Road. [The Silk Road|Britannica](https://www.britannica.com/topic/Silk-Road-trade-route)

Comparing the Silk Road route with the cuisine heatmap, there appears to be a link between the road and cuisine ratings by country. The Silk Road was a major trade network that connected the East and West, facilitating not only the exchange of goods like spices, silk, and ceramics but also culinary practices, techniques, and ingredients. As traders, travelers, and settlers moved along the Silk Road, they brought with them their food culture, which blended with local traditions and created diverse and rich culinary practices in the regions along the route. For example, countries like Italy, China, Turkey, and Iran, which were significant stops on the Silk Road, today boast some of the world's most highly rated cuisines according to TasteAtlas. The migration of ingredients such as spices from Asia to Europe and cooking methods shared across these regions contributed to the development of distinctive yet interconnected cuisines. This historical exchange might explain why many countries along the Silk Road now have globally recognized and highly appreciated food cultures. These connections suggest that trade and cultural interaction may have shaped the culinary excellence observed today.

During the colonization era, European settlers brought their cultural practices, lifestyles, and, most notably, their culinary traditions with them to the Americas. Italian, Spanish, French, and British immigrants transferred their home cuisines across the Atlantic, where they adapted to local ingredients and created new culinary blends. For example, Italian dishes like pasta and pizza evolved in America and became staples of American culture. Similarly, in Latin America, we see the strong influence of Spanish and Portuguese cuisines. The Mediterranean diet, with its emphasis on olive oil, vegetables, and grains, also found its place in the New World through these migration patterns. The Americas thus became a melting pot of immigrant food cultures. Consequently, the European culinary influences—many of which trace back to the Silk Road—formed a key foundation for the rich and diverse food landscape seen in the Americas today.

**Insights on Categorization and p-Value Interpretation:**

When working with different categorization methods (7, 4, and 3 categories), the results varied in their significance. A p-value just above the threshold (e.g., 0.06) does not establish a relationship, but it leaves open a weak association that refined methods or more data might capture better. That is why I tried several categorizations. Because every re-categorization is another test on the same data, a lower p-value after re-binning should be read as exploratory evidence rather than confirmation.

Ultimately, simplifying the categorization to 3 groups produced the p-value closest to the threshold, which could indicate an underlying pattern, though it remains weak.

# 3. Correlation Between Countries' Cuisines and WPS

Before starting that topic, here is a world heatmap of countries colored by their WPS indexes.

**_FIGURE 10_**: Heatmap of countries by WPS indexes

![WPS_heatmap](Plots/WPS_heatmap.png)

From the perspective of this project, my expectation is that countries that appear whitish on this WPS heatmap should appear reddish on the food rating heatmap. This is hard to judge by eye, but Turkey, Iran, India, China, Brazil, Peru, and Algeria fit this description. Visual inspection of the two heatmaps is not sufficient to draw conclusions, though. To measure the relationship between the WPS categories and the TasteAtlas categories that I built from the indexes and ratings, I will run the test in 3 different ways. First, with 7 categories, matching the 7 regions. Because a 7×7 table has 49 cells, which spreads the countries too thinly, I will then reduce the categories to 4 and finally to 3. As a result, we will get 3 different outcomes.

**1. Seven category correlation result:**

I categorized countries by WPS index and by TasteAtlas rating in two separate datasets, then merged the two on country name. On the merged data, I used a chi-square test of independence to determine whether there is a statistically significant relationship between the WPS categories and the culinary quality categories. The chi-square test suits categorical data: it compares the observed count in each combination of categories with the count expected if the two variables were independent. I use this test for all three results.
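The merge and the test statistic described above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the function name `chi_square_independence`, the dictionary inputs, and the toy counts are all made up, and the toy data does not reproduce the results reported below.

```python
from collections import Counter


def chi_square_independence(wps_cat, food_cat):
    """Chi-square statistic, degrees of freedom, and expected counts.

    wps_cat, food_cat: dicts mapping country name -> category label.
    Only countries present in both dicts are used (inner join on name).
    """
    common = sorted(set(wps_cat) & set(food_cat))
    if not common:
        raise ValueError("the two datasets share no country names")
    n = len(common)

    observed = Counter((wps_cat[name], food_cat[name]) for name in common)
    row_totals = Counter(wps_cat[name] for name in common)
    col_totals = Counter(food_cat[name] for name in common)

    chi2 = 0.0
    expected = {}
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            # Count expected in this cell if the two variables were independent.
            e = row_total * col_total / n
            expected[(row, col)] = e
            chi2 += (observed[(row, col)] - e) ** 2 / e

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected


# Toy 3x3 table of made-up counts: rows are WPS categories, columns food categories.
toy_table = {
    "Low": {"Low": 2, "Medium": 3, "High": 5},
    "Medium": {"Low": 3, "Medium": 4, "High": 3},
    "High": {"Low": 5, "Medium": 3, "High": 2},
}
toy_wps, toy_food = {}, {}
for wps_label, row in toy_table.items():
    for food_label, count in row.items():
        for i in range(count):
            name = f"country_{wps_label}_{food_label}_{i}"
            toy_wps[name] = wps_label
            toy_food[name] = food_label

chi2, dof, expected = chi_square_independence(toy_wps, toy_food)
print(round(chi2, 2), dof)
```

In practice, `scipy.stats.chi2_contingency` computes the statistic, the expected counts, and the p-value from a table of counts in one call; the manual version is shown only to make each step visible.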

After running the analysis, the results were clear:

Chi-Square Test Results
- Chi2 Value: 44.50
- p-Value: 0.156
- The p-value is greater than the common threshold of 0.05, suggesting no statistically significant relationship between these two categories.

**Understanding the Results:**

The contingency table, which shows the frequency distribution of WPS and culinary quality categories, provides some interesting observations, but none indicate a strong or meaningful relationship. For instance, countries categorized as "Top" in WPS are distributed across various culinary quality levels, from "Medium-Low" to "Top," without a consistent pattern.

The expected values table, which represents what the distribution would look like if there were no relationship between WPS and culinary quality, shows minimal deviations from the observed values.

**Conclusion:**

The analysis with 7 categories finds no statistically significant relationship between a country's women's rights index and the quality of its cuisine. A p-value of 0.156 means that differences of this size would arise fairly often by chance even if the two variables were unrelated. This reinforces the idea that not all social factors align in predictable ways. Sometimes, the most valuable insights are those that challenge our assumptions and encourage deeper exploration.

**2. Four category correlation result:**

Chi-Square Test Results
- Chi2 Value: 15.10
- p-Value: 0.088
- The p-value is slightly above the common threshold of 0.05, suggesting that there is still no statistically significant relationship between these two categories.

**Understanding the Results:**

The contingency table, which displays the frequency distribution of WPS and culinary quality categories, reveals some interesting distributions, yet nothing definitive enough to suggest a meaningful connection. For example, countries categorized as "Top" in WPS exhibit a diverse range of culinary quality levels, from "Medium-Low" to "Top," without a clear or consistent pattern.

The expected values table, which reflects what the distribution would look like if there were no relationship between WPS and culinary quality, shows only modest deviations from the observed values.

**Conclusion:**

Despite reducing the categories from 7 to 4, the analysis still finds no statistically significant relationship between a country's women's rights index and its cuisine quality. With a p-value of 0.088, the observed differences remain consistent with random variation. This result emphasizes that certain societal metrics may not correlate as intuitively as one might expect. Even when refining categories, it is crucial to remain open to outcomes that challenge preconceived notions.

**3. Three category correlation result:**

Chi-Square Test Results
- Chi2 Value: 8.69
- p-Value: 0.069
- The p-value is slightly above the common threshold of 0.05, suggesting a potential relationship, though it is not statistically significant.

**Understanding the Results:**

The contingency table shows the distribution of WPS and culinary quality categories across 3 groups. Although there are some notable patterns, such as countries in the "High" WPS category being more concentrated in the "Medium" and "High" culinary quality levels, the overall distribution doesn’t provide conclusive evidence of a strong relationship.

The expected values table, which reflects what the distribution would look like under the assumption of no relationship, shows some differences when compared to the observed values, but these differences are not substantial enough to confirm a meaningful connection.

**Conclusion:**

With the countries categorized into 3 groups, the analysis indicates a potential but not statistically significant correlation between a country’s women’s rights index and its cuisine quality. The p-value suggests that the observed variations may still be due to chance. While the results hint at a relationship, they also demonstrate that certain factors might only show weak connections or correlations, depending on the categorization approach used. This serves as a reminder of the importance of testing various categorization methods in complex analyses.

# Final Analysis and Interpretation

**_FIGURE 11_**: Both WPS/TasteAtlas Categories heatmap split into 3 categories (merged WPS and TasteAtlas countries)

![both_heatmap_splited_3_categories](Plots/both_heatmap_splited_3_categories.png)


With three categories instead of seven or four, the p-value is the lowest of the three tests (0.069), although it is still above 0.05. In the two heatmaps above, red indicates the highest category, orange the middle, and gray the lowest. The high TasteAtlas categories form a red band from China to Europe that resembles the Silk Road, which fits my assumption that the Silk Road carried food culture along its route.

On the other hand, while most of Europe is red on the WPS heatmap, most of the Silk Road is gray, except for some Arab countries. Checking countries one by one, some fit my project assumption. For example, Brazil, China, Indonesia, and Turkey are red on the TasteAtlas heatmap but gray on the WPS heatmap. Conversely, Australia, Austria, Canada, Czechia, Germany, and Slovakia are red on the WPS heatmap but gray on the TasteAtlas heatmap. However, to draw firmer conclusions, we need to read the numeric results instead of checking countries one by one by eye.

In this analysis, we aimed to explore the correlation between a country’s Women, Peace, and Security (WPS) index and the quality of its cuisine. We categorized the countries into 7, 4, and 3 categories to see if there was any significant relationship between these two variables.

**7 Categories:** Initially, we divided the countries into 7 categories. The chi-square test resulted in a p-value of 0.156, indicating no statistically significant relationship. The observed distributions across categories did not show any consistent pattern, suggesting that the variables might be independent.

**4 Categories:** Next, we reduced the categorization to 4 groups. The p-value dropped to 0.088, still above the 0.05 threshold. This suggests that any relationship is not strong enough to be statistically significant.

**3 Categories:** Finally, we grouped the countries into 3 categories. The p-value dropped further to 0.069, coming very close to the 0.05 threshold. Although it is technically not statistically significant, this result is noteworthy because it indicates a potential relationship that is not strong but may still be present.
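The three p-values can be cross-checked from the reported chi-square values. The sketch below assumes each test used a full k×k contingency table, so the degrees of freedom are (k - 1)²; that assumption is mine and is not stated in the analysis. The helper `chi2_sf` is a hypothetical stdlib-only function for integer degrees of freedom, written here for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable
    with a positive integer number of degrees of freedom."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    # evaluated in log space to stay stable for larger df.
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Chi-square values reported above, keyed by the number of categories k.
reported_chi2 = {7: 44.50, 4: 15.10, 3: 8.69}
recomputed = {
    k: chi2_sf(value, (k - 1) ** 2)  # ASSUMPTION: full k x k table
    for k, value in reported_chi2.items()
}
for k, p in recomputed.items():
    print(k, (k - 1) ** 2, round(p, 3))
```

If the k×k assumption holds, the recomputed values should land close to the reported 0.156, 0.088, and 0.069. A clear mismatch would suggest that some category combinations were empty after merging, which changes the degrees of freedom.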

With 95 countries, the sample is adequate for the 3- and 4-category tables but thin for the 7-category one, where 49 cells leave an average expected count below 2 per cell. The p-values decreased as the categories were simplified, which suggests that there might be a weak relationship between a country's WPS index and its culinary quality. The evidence is not strong enough for a definitive conclusion, but a weak correlation may be present.

In summary, the relationship between these variables is not statistically significant, but the analysis hints at a potential connection that warrants further investigation. Simplifying the categorization brought this subtle pattern closer to the surface, suggesting that women's rights might be weakly associated with the culinary quality of a country.