# INTRODUCTION AND HISTORY

<br>
<h2 style = "font-size:40px; font-family:Garamond ; font-weight : bold; background-color:#d2691e; color :black   ; text-align: center; border-radius: 5px 5px; padding: 5px"> Role of Rituals in evolution of social complexity</h2> 
<br>

In [None]:
from IPython.display import Image
Image(filename="../input/rituals/Capture.PNG", width= 1300, height=1000)

<br>
<h2 style = "font-size:40px; font-family:Garamond ; font-weight : bold; background-color:#d2691e; color :black   ; text-align: center; border-radius: 5px 5px; padding: 5px"> What is social complexity and what are rituals?</h2> 
<br>
<strong>Social complexity</strong> is the study of the phenomena of human existence (emigration patterns, armed conflicts, political movements, marriage practices, natural disasters and so on) and of the many possible arrangements of relationships between those discrete phenomena. It reflects human behavior as it is exercised in ongoing, increasingly broad and complicated circumstances of individual and group existence, and it has emerged as the conceptual and practical framework within which these phenomena and their relationships can be studied.

A <strong>ritual</strong> is a sequence of activities involving gestures, words, actions, or objects, performed in a sequestered place and according to a set sequence. Rituals may be prescribed by the traditions of a community, including a <strong>religious community</strong>. Rituals are characterized, but not defined, by formalism, traditionalism, invariance, rule-governance, sacral symbolism, and performance.

The origins of religion and ritual in humans have been the focus of centuries of thought in archaeology, anthropology, theology, evolutionary psychology and more. Play and ritual have many aspects in common, and ritual is a key component of the early cult practices that underlie the religious systems of the first complex societies in all parts of the world.

As a deep and enduring aspect of human societies and individual psychology, religion remains an evolutionary puzzle. Considering its centrality in so many lives, however, the scientific study of religion remains a largely neglected and fragmented topic.

In his classic work, <em>The Elementary Forms of the Religious Life</em>, Emile Durkheim (1965) examined the significance of ritual in society. He concluded that one of ritual's major functions is the integration of the individual into the group and the maintenance and vitalization of collective order and solidarity. The performance of ritual, according to this view, expresses the collective consciousness of the social group; hence participation in ritual action reinforces conformity to collective values.


<br>
<h2 style = "font-size:40px; font-family:Garamond ; font-weight : bold; background-color:#d2691e; color :black   ; text-align: center; border-radius: 5px 5px; padding: 5px"> About the dataset</h2> 
<br>

In one of the research papers that form the basis of the dataset, the researchers adopted a scientific method: they first stated 5 predictions based on the rationale of "Divergent Modes of Religiosity" and then started compiling the SESHAT database.



<div class = 'image'> <img style="float:center; width:95% ; border:5px solid #000000" align=center src = https://i0.wp.com/peterturchin.com/wp-content/uploads/2015/10/Ritual-infographic.png?ssl=1> 
</div>

# EXPLORATORY DATA ANALYSIS (EDA)

<br>
<h2 style = "font-size:40px; font-family:Garamond ; font-weight : bold; background-color:#d2691e; color :black   ; text-align: center; border-radius: 5px 5px; padding: 5px"> Analyzing the data</h2> 
<br>

**IMPORTING THE LIBRARIES AND DATA**

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import urllib.parse
import requests
import geopandas as gpd
from shapely.geometry import Point
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


In [None]:
df = pd.read_csv('../input/social-complexity-dataset-ancient-civilization/SocialComplexity3.csv', delimiter = ',', encoding = 'utf-8')
df.head(5)

**DESCRIPTION OF THE VARIABLES**

<div class = 'image'> <img style="float:center; width:95% " align=center src = https://www.pnas.org/content/pnas/115/2/E144/F2.large.jpg?width=800&height=600&carousel=1> 
</div>

<a href ="https://www.pnas.org/content/115/2/E144/tab-figures-data" title = "Figures and SI of the data" style = "font-size:20px; color: dimgrey; text-align:left; font-family:serif">Image Source: Figures and SI of the data</a>

**DATA ANALYSIS AND VISUALISATION**

In [None]:
df.info()

1. There are no null values in the dataset
2. Variables are mostly numeric and continuous
3. NGA and PollID are the only categorical and object-type variables
4. SPC1 is an indicator of social complexity.
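
The first three points can also be checked programmatically instead of being read off the `df.info()` output. The following is a minimal sketch, assuming a small stand-in frame (`toy`, with made-up values and placeholder NGA names) in place of the real dataset; the helper name `summarize_columns` is hypothetical.

```python
import pandas as pd

def summarize_columns(frame):
    """Return one row per column: dtype, null count and number of distinct values."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_null": frame.isnull().sum(),
        "n_unique": frame.nunique(),
    })

# Toy stand-in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({
    "NGA": ["A", "A", "B"],
    "Time": [-500, -400, 900],
    "SPC1": [0.41, 0.47, 0.22],
})

summary = summarize_columns(toy)
print(summary)
```

Calling `summarize_columns(df)` on the real data would give the same table for every column; a column with few distinct values relative to the number of rows is a candidate categorical variable.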

In [None]:
df.select_dtypes(exclude = 'object').describe()

None of the numeric variables seems to be categorical.

In [None]:
df.select_dtypes(exclude= ['int64','float']).describe()

There are 30 unique NGAs (Natural Geographic Areas). We call them ‘NGAs’ because they each cover a geo-ecological zone that has retained its distinctive character over the millennia, even while the scale and structure of human social systems have changed, often quite dramatically. The NGAs were selected to cover as broad a range as possible of social, cultural, political, and economic variation in world history, and were drawn from Africa, Europe, the Pacific, the Americas, and all the major regions of Asia.

**Visualising the correlation between the variables**

In [None]:
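# Drop MG_corr and keep only numeric columns, since corr() cannot handle text columns
df_ax = df.drop(columns=['MG_corr']).select_dtypes(include='number')
corrmat = df_ax.corr()
fig = plt.figure(figsize=(12, 9))
sns.heatmap(corrmat, square=True, cmap="YlGnBu")
plt.show()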

In [None]:
corrmat

1. Most of the parameters are strongly correlated with each other, with correlation coefficients greater than 0.5<br>
2. Apart from Time, the parameters are highly correlated with each other<br>
3. Writing and texts are very highly correlated with each other (0.945)<br>
4. SPC1 is highly correlated with all the other variables<br>
5. This suggests that a single feature can represent most of the variation in social complexity.<br>
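
One way to probe the last point is to ask how much of the total variance a single principal component captures. The sketch below does this with NumPy on synthetic data; the function name `first_pc_share` and the generated variables are assumptions for illustration, and this is not necessarily the procedure that was used to construct SPC1.

```python
import numpy as np

def first_pc_share(values):
    """Share of total variance explained by the first principal component.

    `values` is a 2-D array with one row per observation and one column per variable.
    """
    # Eigenvalues of the correlation matrix are the variances along each
    # principal component of the standardised variables.
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(values, rowvar=False))
    return eigenvalues.max() / eigenvalues.sum()

# Synthetic example: five noisy copies of one underlying factor.
rng = np.random.default_rng(0)
factor = rng.normal(size=500)
data = np.column_stack([factor + 0.3 * rng.normal(size=500) for _ in range(5)])

share = first_pc_share(data)
print(f"First component explains {share:.0%} of the variance")
```

On the real data, the same function could be applied to `df_ax.to_numpy()` after removing Time and SPC1; a share close to 1 would support summarising the variables with one component.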

The map below shows the locations of these NGAs; the code for it was contributed by **bhuvanchennoju/Kaggle**.

In [None]:
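def lat_log_finder(place):
    # Nominatim expects the free-text query in the `q` parameter of /search
    url = 'https://nominatim.openstreetmap.org/search'
    params = {'q': place, 'format': 'json', 'limit': 1}
    # The usage policy asks for an identifying User-Agent; this value is a placeholder
    headers = {'User-Agent': 'nga-map-notebook'}
    response = requests.get(url, params=params, headers=headers, timeout=30).json()
    lat = response[0]["lat"]
    log = response[0]["lon"]
    return [lat, log]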
NGA_list = df.NGA.unique().tolist()
NGA_corrected  =['Big Island Hawaii',     
 'Cahokia',
 'Cambodia',
 'Central Java',
 'Chuuk Islands',
 'Cuzco',
 'Narmada river, india',
 'Finger Lakes, New york, usa',
 'Garo Hills',
 'Ghana',
 'Iceland',
 'Kachhi, pakistan',
 'Kansai',
 'Kalimantan',
 'Konya Plain',
 'Latium',
 'Lena River Valley',
 'ecuador,south america',
 'Middle Yellow River Valley',
 'Niger River',
 'North Colombia',
 'Orkhon Valley',
 'Port Moresby',
 'Paris Basin',
 'Sogdiana',
 'zhaotong',
 'ahvaz,iran',
 'Upper Egypt',
 'Valley of Oaxaca',
 'sana, yemen']

NGA_corrected_dir = { key:val for key,val in zip(NGA_list,NGA_corrected)}

# Geocode each distinct place once instead of once per row, then map the results back
coords = {place: lat_log_finder(place) for place in NGA_corrected}
df['locations'] = df['NGA'].map(NGA_corrected_dir).map(coords)
df['latitude'] = df['locations'].apply(lambda x: float(x[0]))
df['longitude'] = df['locations'].apply(lambda x: float(x[1]))

## creating point objects to make a scatter plot and locate the NGAs
crs = 'EPSG:4326'
# shapely Points take (x, y), i.e. (longitude, latitude)
geometry = [Point(lon, lat) for lat, lon in zip(df['latitude'], df['longitude'])]
geo_df = gpd.GeoDataFrame(df.copy(), crs=crs, geometry=geometry)
geo_df.head(5)

colors= ['#EEB76B','#D8EBE4','#282846','#007580','#fed049']
# loading the world geodataframe
# Note: gpd.datasets was removed in geopandas 1.0, so this call needs an older version
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world = world[world['name']!='Antarctica']

fig, ax  = plt.subplots(figsize = (20,15),dpi =100)

fig.patch.set_facecolor('#f6f5f5')
ax.set_facecolor('#f6f5f5')
# scheme and k only apply when a column is mapped to colour, so they are dropped here
world.dropna().plot(legend=False, linewidth=1,
                    ax=ax, color=colors[-2], alpha=0.8)
geo_df.plot(color =colors[-1], marker = '^',ax = ax,markersize = 70)
loc_check = []
for loc,y,x in zip(df['NGA'], df['latitude'],df['longitude']):
    if loc in loc_check:
        continue
    if x < 20:
        ha = 'right'
        x_off = -4
        y_off = 1.5
    else:
        ha = 'left'
        x_off = 2
        y_off =0
    ax.annotate(loc, xy= (x,y),xytext= (x+x_off,y+y_off),
                    fontsize=11.5,fontfamily='serif',fontweight ='bold',ha=ha, va='bottom',color='black', alpha =0.8)
    loc_check.append(loc) 
    
for loc in ['left','right','top','bottom']:
    ax.spines[loc].set_visible(False)
ax.axes.get_xaxis().set_visible(False)
ax.axes.get_yaxis().set_visible(False)

### titles and tags

fig.text(0.12,0.77,'NGAs on Worldmap' ,**{'font':'serif', 'size':25,'weight':'bold',}, alpha = 0.9)

fig.text(0.12,0.725,'''This visualization shows actual locations of NGA (Natural Geographic Areas).''',**{'font':'serif', 'size':14,}, alpha = 0.8)

fig.text(0.77,0.28,'© Made by bhuvanchennoju/Kaggle',{'font':'serif', 'size':8, 'weight':'bold',},alpha = 0.7)

plt.show()

To understand the evolution of social complexity, we need to know which NGAs have data dating back to the earliest time periods and which have only quite recent data.

In [None]:
dft=df.groupby('NGA')['Time']
df['Start']=dft.transform('min')
df['End']=dft.transform('max')

In [None]:
plt.figure(figsize=(14,10))
plt.scatter(df['Time'],df['NGA'],marker= '*', color='blue')
plt.xlabel("Date",fontweight='bold',fontsize=12)
plt.ylabel("NGA",fontweight='bold',fontsize=12)
plt.title("NGAs vs Time Periods", fontweight='bold',fontsize=16)
plt.xticks(rotation=90)
# Call tight_layout after rotating the tick labels so they are taken into account
plt.tight_layout()
plt.show()

1. Konya Plain is the oldest NGA, with data dating back to 10000 B.C.
2. Southern China Hills, Oro PNG, Garo Hills and Chuuk Islands are among the most recent NGAs, with data running up to 2000 A.D.
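
These two observations can be read off the scatter plot, but they can also be computed. A minimal sketch on a toy frame follows; the NGA names `A`, `B`, `C` and all values are placeholders, not figures from the dataset.

```python
import pandas as pd

# Toy stand-in for the real data (made-up values).
toy = pd.DataFrame({
    "NGA": ["A", "A", "B", "B", "C"],
    "Time": [-9600, -9000, -500, 1900, 1800],
})

# First and last observation date for each NGA.
span = toy.groupby("NGA")["Time"].agg(Start="min", End="max")

# NGA with the earliest first record.
oldest = span["Start"].idxmin()
# All NGAs whose last record is the latest one in the data.
latest = span.index[span["End"] == span["End"].max()].tolist()

print(span)
print("Oldest:", oldest, "| most recent:", latest)
```

Replacing `toy` with `df` would list the NGAs by their real start and end dates.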

Let's see the number of observations for each NGA.

In [None]:
plt.figure(figsize=(12,10))
sns.countplot(x='NGA', data=df)
plt.title('NGA counts',fontweight='bold',fontsize=16)
plt.xlabel('NGA',fontweight='bold',fontsize=12)
plt.ylabel('Count',fontweight='bold',fontsize=12)
plt.xticks(rotation=90)
plt.show()

The number of observations differs hugely between NGAs: some have very few observations, whereas others have a lot.
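
To put a number on this imbalance, the counts can be tabulated directly. The sketch below uses a toy frame with placeholder NGA names and made-up group sizes.

```python
import pandas as pd

# Toy stand-in: three NGAs with 12, 3 and 7 observations.
toy = pd.DataFrame({"NGA": ["A"] * 12 + ["B"] * 3 + ["C"] * 7})

# Observations per NGA, largest group first.
counts = toy["NGA"].value_counts()
# Ratio between the best and worst covered NGA.
ratio = counts.max() / counts.min()

print(counts)
print(f"Largest / smallest group: {ratio:.1f}x")
```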

In [None]:
# Adding the count column to the dataset
df_grouped=df.groupby(['NGA'])['End'].count().reset_index(name="count")
df=pd.merge(df,df_grouped,on='NGA')

In [None]:
#Making a new column X which combines the name ,start time of data collection and the number of observations for each NGA
df['X']=df['NGA']+"("+df['Start'].apply(str)+","+df['count'].apply(str)+")"

In [None]:
#Let's see the trend of social complexity of the oldest NGA over time
dfkp=df[df['NGA']=='Konya Plain']
plt.figure(figsize=(16,8))
plt.title('Social Complexity trend in Konya Plain',fontweight='bold',fontsize=16)
plt.xlabel('Time',fontweight='bold',fontsize=12)
plt.ylabel('SPC1',fontweight='bold',fontsize=12)
plt.plot(dfkp['Time'],dfkp['SPC1'],color='red')
plt.show()

As we can see, social complexity increases over time, with a few ups and downs.

**10 world regions**

**Africa** - Ghanaian Coast, Niger Inland Delta, Upper Egypt

**Europe** - Iceland, Paris Basin, Latium

**Central Eurasia** - Lena River Valley, Orkhon Valley, Sogdiana

**Southwest Asia** - Yemeni Coastal Plain, Konya Plain, Susiana

**South Asia** - Garo Hills, Deccan, Kachi Plain

**Southeast Asia** - Kapuasi Basin, Central Java, Cambodian Basin

**East Asia** - Southern China Hills, Kansai, Middle Yellow River Valley

**North America** - Finger Lakes, Cahokia, Valley of Oaxaca

**South America** - Lowland Andes, North Colombia, Cuzco

**Oceania-Australia** - Oro PNG, Chuuk Islands, Big Island Hawaii

In [None]:
# Select the rows belonging to each of the four regions compared below
df_europe = df[df['NGA'].isin(["Iceland", "Paris Basin", "Latium"])]
df_eurasia = df[df['NGA'].isin(["Lena River Valley", "Orkhon Valley", "Sogdiana"])]
df_na = df[df['NGA'].isin(["Finger Lakes", "Cahokia", "Valley of Oaxaca"])]
df_sa = df[df['NGA'].isin(["Lowland Andes", "North Colombia", "Cuzco"])]

Let's plot the SPC1 values over time for Europe, Central Eurasia, North America and South America

In [None]:
plt.figure(figsize=(20,15))
#Europe
plt.subplot(2,2,1)
plt.title('Europe',fontweight='bold')
plt.xlabel('Time',fontweight='bold')
plt.ylabel('SPC1',fontweight='bold')
plt.plot('Time','SPC1',data=df_europe[df_europe['NGA']=="Iceland"])
plt.plot('Time','SPC1',data=df_europe[df_europe['NGA']=="Paris Basin"])
plt.plot('Time', 'SPC1',data=df_europe[df_europe['NGA']=="Latium"])
plt.legend(['Iceland','Paris Basin','Latium'])
#Central Eurasia
plt.subplot(2,2,2)
plt.xlabel('Time',fontweight='bold')
plt.ylabel('SPC1',fontweight='bold')
plt.title('Central Eurasia',fontweight='bold')
plt.plot('Time','SPC1',data=df_eurasia[df_eurasia['NGA']=="Lena River Valley"])
plt.plot('Time','SPC1',data=df_eurasia[df_eurasia['NGA']=="Orkhon Valley"])
plt.plot('Time', 'SPC1',data=df_eurasia[df_eurasia['NGA']=="Sogdiana"])
plt.legend(['Lena River Valley','Orkhon Valley','Sogdiana'])

#North America
plt.subplot(2,2,3)
plt.title('North America',fontweight='bold')
plt.xlabel('Time',fontweight='bold')
plt.ylabel('SPC1',fontweight='bold')
plt.xlim(-3000,2000)
plt.ylim(0,1)
plt.plot('Time','SPC1',data=df_na[df_na['NGA']=="Finger Lakes"])
plt.plot('Time','SPC1',data=df_na[df_na['NGA']=="Cahokia"])
plt.plot('Time', 'SPC1',data=df_na[df_na['NGA']=="Valley of Oaxaca"])
plt.legend(['Finger Lakes','Cahokia','Valley of Oaxaca'])

#South America
plt.subplot(2,2,4)
plt.title('South America',fontweight='bold')
plt.xlabel('Time',fontweight='bold')
plt.ylabel('SPC1',fontweight='bold')
plt.xlim(-3000,2000)
plt.ylim(0,1)
plt.plot('Time','SPC1',data=df_sa[df_sa['NGA']=="Lowland Andes"])
plt.plot('Time','SPC1',data=df_sa[df_sa['NGA']=="North Colombia"])
plt.plot('Time', 'SPC1',data=df_sa[df_sa['NGA']=="Cuzco"])
plt.legend(['Lowland Andes','North Colombia','Cuzco'])
plt.show()

As the plots above show, social complexity has increased over time in these societies. However, the timing of the take-off differs: societies in the Americas took off considerably later than those of Europe and Central Eurasia. There is also a marked difference in the rate of change and in the level of social complexity reached by the 1900s (colonial times). The difference in SPC1 values indicates that societies in the Americas were not as complex as those of Eurasia at the time they came into contact with each other. This could be one of the contributing factors to the invasion and colonization of the Americas by European countries.
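
The timing of a take-off can be made concrete by recording the first date at which SPC1 reaches some level. The sketch below is one possible way to do this; the helper name `takeoff_time`, the threshold of 0.5 and the toy values are all assumptions chosen for illustration, and SPC1 is assumed to lie on a 0 to 1 scale as in the plots above.

```python
import pandas as pd

def takeoff_time(frame, threshold=0.5):
    """Earliest Time at which each NGA's SPC1 reaches `threshold` (NaN if never)."""
    reached = frame[frame["SPC1"] >= threshold]
    first = reached.groupby("NGA")["Time"].min()
    # Reindex so that NGAs which never reach the threshold still appear, as NaN.
    return first.reindex(frame["NGA"].unique())

# Toy stand-in for the real data (made-up values, placeholder NGA names).
toy = pd.DataFrame({
    "NGA":  ["A", "A", "A", "B", "B", "B"],
    "Time": [-3000, -2000, -1000, -1000, 0, 1000],
    "SPC1": [0.2, 0.6, 0.8, 0.1, 0.3, 0.4],
})

takeoffs = takeoff_time(toy)
print(takeoffs)
```

Comparing the resulting dates across regions would turn the visual impression of earlier and later take-offs into a table, although the result depends on the chosen threshold.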

From these curves, we can also see that the maximum value of SPC1 and its value at the end time (the time up to which data has been gathered, or around which the civilization ends) are almost the same. So we can use the maximum values of SPC1 to compare social complexity across NGAs.
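
Whether the maximum and the final value of SPC1 really coincide can be checked per NGA rather than judged by eye. A minimal sketch on a toy frame, with placeholder NGA names and made-up values:

```python
import pandas as pd

# Toy stand-in for the real data (made-up values).
toy = pd.DataFrame({
    "NGA":  ["A", "A", "A", "B", "B", "B"],
    "Time": [-1000, 0, 1000, 0, 500, 1000],
    "SPC1": [0.3, 0.7, 0.9, 0.2, 0.6, 0.5],
})

# Sort by time so that "last" really is the most recent observation.
ordered = toy.sort_values(["NGA", "Time"])
summary = ordered.groupby("NGA")["SPC1"].agg(peak="max", final="last")
# A gap of zero means the NGA ends at its peak complexity.
summary["gap"] = summary["peak"] - summary["final"]
print(summary)
```

If the `gap` column is small for every NGA in the real data, using the maximum as a summary is reasonable; large gaps would point to societies whose complexity declined before the end of their record.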

In [None]:
sc=df.groupby(["X"])["SPC1"].max()
plt.figure(figsize=(16,8))
plt.title('SPC1 values for different NGAs',fontweight='bold',fontsize=16)
plt.xlabel('NGA',fontweight='bold',fontsize=12)
plt.ylabel('SPC1',fontweight='bold',fontsize=12)
plt.plot(sc,color='green');
plt.xticks(rotation=90)
plt.show()

1. NGAs which took off earlier mostly have high social complexity<br>
2. The NGAs with very low maximum SPC1 are mostly those with very few observations; the exception is Cahokia, which has a sufficient number of observations and data dating back to 600 B.C. Most of these NGAs are recent, and the few observations that exist show low social complexity.

Thus we can conclude that:<br>
1. SPC1, which is a measure of social complexity, is related to other factors such as government, levels of hierarchy (political, religious etc.), population, money and infrastructure.
2. Some of the very recent NGAs have very few observations.
3. Social complexity evolves over time in all the civilizations and societies.
4. Most of the civilizations which took off earlier have higher social complexity.
5. The higher social complexity of Eurasia and Europe compared to the Americas at the time of contact can be considered a possible contributing factor to the colonization of the Americas.
6. Most of the variables are well correlated with each other (> 0.5).


# REFERENCES

1. https://www.pnas.org/content/115/2/E144.full
2. https://escholarship.org/uc/item/4836f93g
3. http://www.harveywhitehouse.com/events/2018/5/9/future-directions-on-the-evolution-of-rituals-beliefs-and-religious-minds

# Thanks for reading the notebook 