<h1 style="text-align: center; color: red; font-size: 25pt; padding: 5px; font-weight: bold; text-shadow: 1px 1px 1px rgb(50, 100, 100)"> Terror Attacks In Nigeria .... </h1> 
<p style="font-size: 12pt; text-align: center"> Abubakar Abdulkadir</p>

<img src="images/bg.jpg" style="width: 100%; height:350px; display: inline-block" />

# 1.0 Project Overview

"A terror attack is the unlawful use of force or violence against persons or property to intimidate or coerce a government, the civilian population, or any segment thereof in furtherance of political or social objectives." - The Federal Bureau of Investigation (FBI), US.

Nigeria is one of the countries where insecurity and terrorist activity have been on the rise. The West African nation is among the most hazardous in the world, with some of the most frequent and deadly attacks. There have been 2,470 terrorist incidents in the last five years, resulting in 17,751 fatalities and 7,345 injuries (WorldData.info). With the emergence of new dangerous groups such as Fulani extremists and herdsmen, these numbers continue to rise daily.

A notable characteristic of attacks in Nigeria is that each terrorist group or sect's attacks can almost immediately be linked to it through similarities in mode, target, duration, motive, and casualty toll, among other factors. This suggests that patterns and intelligence can be extracted from datasets of these attacks. Although records of the attacks are available in public places such as Wikipedia, social media platforms like Facebook and Twitter, news websites, and newspapers, they are not offered in formats that are simple to use with tools for data manipulation and analysis. This limits the use of data-centric approaches in combating terrorism in the nation.

This gap motivates the project. To aid in the fight against insecurity, the project aims to collect data about attacks and security incidents in Nigeria, process the data into a widely used format, analyse the dataset to identify trends and patterns, and answer questions about the mode of operation, timing, motive, and other aspects of the attacks.

Although the project's main focus is on insecurity in Nigeria, it has worldwide significance because terrorism has always been a problem on a global scale. While it is true that this threat's exposure differs from region to region and country to country, the necessity of taking effective action on a global scale cannot be overstated.

# 2.0 Data Collection

The datasets considered for this project are collected from four sources: the Armed Conflict Location & Event Data Project (ACLED), the Global Terrorism Database (GTD), the ChatGPT API, and GitHub.

### 2.1 Armed Conflict Location & Event Data Project (ACLED)

The dataset from this source includes all insecurity incidents in Nigeria from January 2009 to 28 March 2023. It has 31 features, which include time- and date-related features, location features, event properties, casualties, information sources, and a brief description of each event.

The insecurity incidents covered in this dataset include violence against civilians, battles, protests, riots, explosions/remote violence, and strategic developments. The dataset comes in CSV format and can be accessed on the ACLED data download <a href='https://acleddata.com/data-export-tool/'>portal</a>.
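
As a rough illustration of how a CSV export like this can be summarised by event type, here is a minimal sketch using pandas. The rows are invented stand-ins, and the column names (`event_date`, `event_type`, `admin1`, `fatalities`) and the date format are assumptions that should be checked against the actual download.

```python
import io
import pandas as pd

# Tiny invented stand-in for an ACLED export; column names and the ISO date
# format are assumptions, not confirmed against the real file.
sample_csv = io.StringIO(
    "event_date,event_type,admin1,fatalities\n"
    "2009-01-05,Battles,State A,12\n"
    "2015-07-11,Violence against civilians,State B,4\n"
    "2023-03-28,Protests,State C,0\n"
)

acled = pd.read_csv(sample_csv, parse_dates=["event_date"])

# Count incidents and total fatalities per event type.
summary = (
    acled.groupby("event_type")["fatalities"]
    .agg(incidents="count", fatalities="sum")
    .sort_values("fatalities", ascending=False)
)
print(summary)
```

With the real file, `sample_csv` would be replaced by the path to the downloaded CSV.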

### 2.2 Global Terrorism Database (GTD)

The Global Terrorism Database (GTD) is an open-source database that includes information on terrorist events around the world from 1970 through 2020 (with annual updates planned for the future). Unlike many other event databases, the GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period, and it now includes more than 200,000 cases. For this use case, however, the data collected from this database was filtered to contain only events that occurred in Nigeria.

The dataset collected from this source includes 135 features, which were narrowed down to 16 for this project. The dataset can be downloaded from the <a href="https://www.start.umd.edu/gtd/">GTD portal</a>.
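
The filtering and narrowing described above might look roughly like the following sketch. The rows are invented, the column names (`country_txt`, `iyear`, `city`, `nkill`, `nwound`) are assumed rather than confirmed against the download, and the `keep` list is a hypothetical subset, not the 16 features actually selected for the project.

```python
import pandas as pd

# Invented stand-in rows; the column names are assumptions.
gtd = pd.DataFrame(
    {
        "country_txt": ["Nigeria", "Kenya", "Nigeria"],
        "iyear": [2014, 2015, 2019],
        "city": ["City A", "City B", "City C"],
        "nkill": [20, 7, 3],
        "nwound": [35, 9, 1],
        "unused_feature": ["a", "b", "c"],
    }
)

keep = ["iyear", "city", "nkill", "nwound"]  # hypothetical subset of features

# Keep only Nigerian events, then narrow to the chosen columns.
gtd_nigeria = gtd.loc[gtd["country_txt"] == "Nigeria", keep].reset_index(drop=True)
print(gtd_nigeria)
```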

### 2.3 ChatGPT API

The ChatGPT API is the application programming interface through which the ChatGPT model can be accessed. ChatGPT is an AI chatbot that was initially built on a family of large language models (LLMs) collectively known as GPT-3. The model has been trained on a huge amount of data to understand text prompts and generate human-like responses to them. Although it occasionally provides incorrect responses, its ability to produce human-like, and frequently accurate, responses to a vast range of questions is why it is considered for this project.

This API is used to curate a list of properties for each local government area in Nigeria. Note, however, that the API's responses reflect the model's knowledge as of 2021, the cutoff of the data it was trained on.

#### 2.3.1 Collecting the Dataset from the ChatGPT API

In [1]:
#importing packages
import openai  # the calls below use the pre-1.0 package interface (openai.ChatCompletion)
import pandas as pd
from csv import writer
import re

In [2]:
openai.api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'  # input the API key

In [4]:
df = pd.read_csv('datasets/cleaned/lgas.csv') # read the CSV file listing all local government areas in Nigeria

In [None]:
# seek details in batches of 100
for i in range(0, 800, 100):
    cur_lgas = df.iloc[i:i + 100]
    lga = list(cur_lgas['name'])
    
    #loop through each local government name in each batch
    for j in range(len(lga)):
        prompt = '''In one word each, Provide actual values for the questions  
                1. Percentage of Educated
                2. Level of Isolation
                3. Natural Barriers
                4. Elevation
                5. Population
                6. Major Occupation
                7. Political stability
                8. Economic Situation
                9. Average Cost of living
                10. Employment rate
                11. Ethnic Marginalization
                12. Transport infrastructure
                13. Longitude 
                14. Latitude
                15. Income Equality
                16. Major ethnic group
                17. Communication network
                18. Infrastructure development
                19. Dominant Age group
                20. Average family size
                21. Average education level
                22. Average Income in Naira
                of {}. 
                Number your response and put your answer in a single line '''.format(lga[j])

        response = openai.ChatCompletion.create(
              model="gpt-3.5-turbo",
              messages=[ 
                    {"role": "user", "content": prompt},     
                ]
        )
        
        data = (response['choices'][0]['message'].content).replace('\n', ' ')

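        # append the response to a file; the with-block closes the handle after each write
        # ("*** 0" and "*** /n" are literal record markers that the parsing step strips)
        with open('laga_dataset.txt', 'a', encoding="utf-8") as f_object:
            f_object.write("*** 0" + lga[j] + " "  + " " + data + " " + "*** /n")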

#### 2.3.2 Formatting the dataset into CSV

In [None]:
# read the txt file (same UTF-8 encoding that was used when writing it)
with open('laga_dataset.txt', 'r', encoding="utf-8") as f_handler:
    features = f_handler.read()

# split the dataset into one entry per local government using the "*** 0" marker and ignore the first entry (it is empty)
each_feature = features.split("*** 0")[1:]

# split each feature into a list and append it to another list to form a list of list
features_list = []
for feature in each_feature:
    feature = feature.replace("*** /n", "")
    str_list = re.sub(r'\d+\. ', '<break>', feature).split('<break>')
    str_list = [x.strip() for x in str_list if (len(x.strip()) > 1 or re.findall(r'\d+', x))]
    features_list.append(str_list)
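
To make the splitting logic above concrete, the sketch below runs the same steps on a single synthetic record. The LGA name and answers are invented for illustration, and only four of the 22 numbered answers are shown.

```python
import re

# One synthetic record in the layout written by the collection loop,
# after the file has been split on the "*** 0" marker.
record = "Example LGA  1. 40% 2. Low 3. None 4. 300m *** /n"

record = record.replace("*** /n", "")
# Replace each "N. " list marker with a sentinel, then split on it.
parts = re.sub(r"\d+\. ", "<break>", record).split("<break>")
# Keep fields longer than one character, or any field that contains a digit.
fields = [p.strip() for p in parts if len(p.strip()) > 1 or re.findall(r"\d+", p)]
print(fields)  # ['Example LGA', '40%', 'Low', 'None', '300m']
```

One limitation worth keeping in mind: an answer that itself contains a number followed by a period and a space would also match the pattern and be split into two fields.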

In [None]:
# define the pandas column header names
columns = [
            "Lga Name", "percentage of educated", "Level of Isolation", "Natural Barriers", "Elevation", 
            "Population", "Major Ocupation", "Political stability", 
            "Economic Situation", "Average Cost of living", "Employment rate", 
            "Ethnic Marginalization", "Transport infrastructure", "Longitude",  
            "Latitude", "Income Equality", "Major ethnic group", "Communication network", 
            "Infrastructure development", "Dominant Age group", "Average family size",
            "Average education level", "Average Income in Naira"
]


#create a dataframe with the dataset
feature_df = pd.DataFrame(features_list, columns=columns)
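
Because the model's replies are free text, some parsed rows may not contain exactly one field per column header, which can cause errors or silent gaps when the dataframe is built. A minimal sketch of a length check before building it is shown below; the shortened `columns` list, the rows, and the names `good`, `bad`, and `checked_df` are all hypothetical.

```python
import pandas as pd

columns = ["Lga Name", "Elevation", "Population"]  # shortened header list for the sketch

# Invented parsed rows: the second one is missing a field.
rows = [
    ["LGA one", "300m", "120000"],
    ["LGA two", "450m"],
    ["LGA three", "90m", "80000"],
]

# Separate rows with the expected field count from those that need a re-check.
good = [r for r in rows if len(r) == len(columns)]
bad = [r[0] for r in rows if len(r) != len(columns)]

checked_df = pd.DataFrame(good, columns=columns)
print(checked_df)
print("Rows to re-check:", bad)
```

The names collected in `bad` could then be queried again or corrected by hand.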

### 2.4 GitHub

The dataset of local governments in Nigeria was obtained from GitHub. It contains all 774 local government areas in Nigeria. The local government is the lowest tier of government in Nigeria, which is why properties of places are collected at the local government level. The dataset was obtained from <a href='https://github.com/xosasx/nigerian-local-government-areas'>this</a> repository.