# **World Development Indicators Analysis**

## **Introduction**

This project analyzes the World Development Indicators dataset compiled by the World Bank, which gathers socioeconomic, demographic, environmental, and development-related statistics for countries around the world. The indicators span economic growth, poverty, education, health, environment, governance, and more. The aim is to gain insights into global trends, regional disparities, and the development progress of individual countries. The dataset is static and covers 1960 to 2015.

The dataset is available in two file formats, CSV and SQLite, but this analysis uses only the SQLite file.

### **Data source**

To access and organize the World Development Indicators dataset, run the "get_dataset.py" file.

In [1]:
# Libraries

import sqlite3
import pandas as pd

In [3]:
# Open a connection to the World Development Indicators database.
# Note: sqlite3.connect() creates a new, empty database if the file does not
# exist, so an empty all_tables list usually means the path is wrong.

wdi_db_path = r"C:\Users\pc\Documents\Python\W.D.I.-Analysis\W.D.I. Dataset\W.D.I. Archive\indicators.sqlite"
db_connection = sqlite3.connect(wdi_db_path)
cursor = db_connection.cursor()

def connect_db():
    """Test the connection and store the sqlite_master rows of every table in the global all_tables."""
    global all_tables
    try:
        cursor.execute("SELECT * FROM sqlite_master WHERE type='table';")
        all_tables = cursor.fetchall()
        print("Database connection successful...")
    except sqlite3.Error as error:
        print("Error occurred - ", error)

connect_db()
# Remember to call db_connection.close() once the analysis is finished.

Database connection successful...
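
The reminder to close the connection can also be handled automatically. The following is a minimal sketch, not the notebook's own approach: `list_tables` is a hypothetical helper, and the demo runs against a throwaway database file rather than the real `indicators.sqlite`. It assumes the connection is only needed for the duration of one query.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    # closing() calls connection.close() on exit, even if the query raises.
    # (Using the connection itself as a context manager only manages
    # transactions; it does not close the connection.)
    with closing(sqlite3.connect(db_path)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
        ).fetchall()
    return [name for (name,) in rows]


# Demo on a throwaway database file (an assumption for illustration only).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.sqlite")
    with closing(sqlite3.connect(demo_path)) as connection:
        connection.execute("CREATE TABLE Country (CountryCode TEXT);")
        connection.commit()
    demo_tables = list_tables(demo_path)

print(demo_tables)  # prints ['Country']
```

For a long interactive session, keeping one open connection as the notebook does is simpler; this pattern fits better when each step opens the database on its own.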


In [4]:
# Print the total number of rows and columns of every table in the database.

def db_totals():
    print(f"Table\t\t\t Total Rows \t\t\tTotal Columns\n{'-' * 75}")
    for table in all_tables:
        # sqlite_master rows are (type, name, tbl_name, rootpage, sql).
        table_name = table[1]
        cursor.execute(f"SELECT COUNT(*) FROM {table_name};")
        total_rows = cursor.fetchone()[0]
        # PRAGMA table_info returns one row per column of the table.
        cursor.execute(f"PRAGMA table_info({table_name});")
        num_columns = len(cursor.fetchall())
        print(f"{table_name:<20} {total_rows:>20} {num_columns:>20}")

db_totals()


Table			 Total Rows 			Total Columns
---------------------------------------------------------------------------
Country                               247                   31
CountryNotes                         4857                    3
Series                               1345                   20
Indicators                        5656458                    6
SeriesNotes                           369                    3
Footnotes                          532415                    4
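
If the totals are needed later in the analysis rather than only printed, they can be returned as data. The sketch below is one possible variant under stated assumptions: `table_totals` is a hypothetical function name, and the demo uses a small in-memory database with made-up rows instead of the real file.

```python
import sqlite3


def table_totals(connection):
    """Return a list of (table_name, total_rows, total_columns) tuples."""
    names = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
        )
    ]
    totals = []
    for name in names:
        # Double-quote the identifier (doubling any embedded quotes) so that
        # unusual table names cannot break the statement.
        quoted = '"' + name.replace('"', '""') + '"'
        total_rows = connection.execute(
            f"SELECT COUNT(*) FROM {quoted};"
        ).fetchone()[0]
        total_columns = len(
            connection.execute(f"PRAGMA table_info({quoted});").fetchall()
        )
        totals.append((name, total_rows, total_columns))
    return totals


# Demo on an in-memory database with illustrative rows (not the real data).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Country (CountryCode TEXT, ShortName TEXT);")
demo.executemany(
    "INSERT INTO Country VALUES (?, ?);", [("KEN", "Kenya"), ("PER", "Peru")]
)
demo.execute("CREATE TABLE Series (SeriesCode TEXT);")
demo_totals = table_totals(demo)
demo.close()

for name, total_rows, total_columns in demo_totals:
    print(f"{name:<20} {total_rows:>20} {total_columns:>20}")
```

Returning tuples keeps the counting separate from the formatting, so the same result can feed a printed report or a DataFrame.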


In [5]:
# Show the column information (from PRAGMA table_info) for the relevant columns of a table chosen by the user.
selected_tables = ["Country", "CountryNotes", "Indicators", "Series"]
# Relevant columns per table, kept as a one-element list wrapping a dictionary.
tables_columns = [
        {
            "Country": ["CountryCode", "ShortName", "Region", "IncomeGroup"],
            "CountryNotes": ["Countrycode", "Seriescode", "Description"],
            "Indicators": ["CountryName", "CountryCode", "IndicatorName", "IndicatorCode", "Year", "Value"],
            "Series": ["SeriesCode", "Topic", "IndicatorName", "LongDefinition"]
        }
    ]
print(selected_tables, "\n")

def view_relevant_data():
    table_name = input("Table name: ").strip()

    if table_name in selected_tables:
        cursor.execute(f"PRAGMA table_info({table_name});")
        result = cursor.fetchall()

        print(f"{table_name} Table")
        for column in result:
            # column[1] is the column name in a table_info row.
            if column[1] in tables_columns[0][table_name]:
                print(column)
    else:
        print("There is no such table among the selected tables.")

view_relevant_data()


['Country', 'CountryNotes', 'Indicators', 'Series'] 

Indicators Table
(0, 'CountryName', 'TEXT', 0, None, 0)
(1, 'CountryCode', 'TEXT', 0, None, 0)
(2, 'IndicatorName', 'TEXT', 0, None, 0)
(3, 'IndicatorCode', 'TEXT', 0, None, 0)
(4, 'Year', 'INTEGER', 0, None, 0)
(5, 'Value', 'NUMERIC', 0, None, 0)
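
Each tuple above follows the column order that `PRAGMA table_info` returns: `cid`, `name`, `type`, `notnull`, `dflt_value`, `pk`. As a hedged sketch, the same lookup can take the table name as a parameter instead of user input and label those fields. `ColumnInfo` and `describe_columns` are hypothetical names, and the demo table is a reduced stand-in for the real `Indicators` table.

```python
import sqlite3
from collections import namedtuple

# Field order follows the columns that PRAGMA table_info returns.
ColumnInfo = namedtuple("ColumnInfo", "cid name type notnull dflt_value pk")


def describe_columns(connection, table_name, wanted):
    """Return ColumnInfo records for the columns of table_name listed in wanted."""
    quoted = '"' + table_name.replace('"', '""') + '"'
    rows = connection.execute(f"PRAGMA table_info({quoted});").fetchall()
    if not rows:
        # table_info returns no rows when the table does not exist.
        raise ValueError(f"No such table: {table_name!r}")
    return [ColumnInfo(*row) for row in rows if row[1] in wanted]


# Demo on an in-memory stand-in for the Indicators table.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Indicators (CountryName TEXT, Year INTEGER, Value NUMERIC);"
)
demo_columns = describe_columns(demo, "Indicators", ["Year", "Value"])
demo.close()

for column in demo_columns:
    print(column.name, column.type)
```

Raising `ValueError` for an unknown table is a design choice of this sketch; it makes a mistyped table name visible instead of silently printing nothing.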


In [12]:
# View a random sample from the selected database table.

def display_random_sample():
    """
    Returns a random sample of 10 rows from the selected table in the database.
    """
    table_name = input("Table name: ").strip()

    try:
        available_columns = tables_columns[0][table_name]
    except KeyError:
        print("There is no such table among the selected tables.")
        return None

    query = f"SELECT {', '.join(available_columns)} FROM {table_name};"
    # Note: this loads the whole table into memory before sampling.
    df = pd.read_sql_query(query, db_connection)
    return df.sample(n=10)  # Random sample of 10 rows from the DataFrame


display_random_sample()


|  | SeriesCode | Topic | IndicatorName | LongDefinition |
|---|---|---|---|---|
| 1067 | TM.TAX.TCOM.BR.ZS | Private Sector & Trade: Tariffs | Bound rate, simple mean, primary products (%) | Simple mean bound rate is the unweighted avera... |
| 983 | IC.FRM.CORR.ZS | Private Sector & Trade: Business environment | Informal payments to public officials (% of fi... | Informal payments to public officials are the ... |
| 985 | IC.TAX.LABR.CP.ZS | Private Sector & Trade: Business environment | Labor tax and contributions (% of commercial p... | Labor tax and contributions is the amount of t... |
| 770 | FM.LBL.BMNY.CN | Financial Sector: Monetary holdings (liabilities) | Broad money (current LCU) | Broad money (IFS line 35L..ZK) is the sum of c... |
| 520 | SE.SEC.PROG.MA.ZS | Education: Efficiency | Progression to secondary school, male (%) | Progression to secondary school refers to the ... |
| 126 | DT.DOD.PROP.CD | Economic Policy & Debt: External debt: Debt ou... | PPG, other private creditors (DOD, current US$) | Public and publicly guaranteed other private c... |
| 386 | NV.MNF.TXTL.ZS.UN | Economic Policy & Debt: National accounts: Sha... | Textiles and clothing (% of value added in man... | Value added in manufacturing is the sum of gro... |
| 216 | DT.NFL.DPPG.CD | Economic Policy & Debt: External debt: Net flows | Net flows on external debt, public and publicl... | Public and publicly guaranteed long-term debt ... |
| 626 | ER.PTD.TOTL.ZS | Environment: Biodiversity & protected areas | Terrestrial and marine protected areas (% of t... | Terrestrial protected areas are totally or par... |
| 346 | NE.GDI.STKB.CN | Economic Policy & Debt: National accounts: Loc... | Changes in inventories (current LCU) | Inventories are stocks of goods held by firms ... |
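
For a large table such as `Indicators` (5,656,458 rows), reading every row into a DataFrame just to keep 10 of them is costly. One alternative, sketched here under stated assumptions, is to let SQLite pick the sample with `ORDER BY RANDOM() LIMIT`. `sample_rows` is a hypothetical helper and the demo table holds made-up values. SQLite still scans the table to order it, but only the sampled rows are returned to Python.

```python
import sqlite3


def quote_identifier(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def sample_rows(connection, table_name, columns, n=10):
    """Return up to n randomly chosen rows, restricted to the given columns."""
    quoted_table = quote_identifier(table_name)
    existing = {
        row[1]
        for row in connection.execute(f"PRAGMA table_info({quoted_table});")
    }
    if not existing:
        raise ValueError(f"No such table: {table_name!r}")
    # Validate column names first: SQLite may treat an unknown double-quoted
    # name as a string literal instead of raising an error.
    missing = [column for column in columns if column not in existing]
    if missing:
        raise ValueError(f"Unknown columns in {table_name!r}: {missing}")
    column_list = ", ".join(quote_identifier(column) for column in columns)
    query = f"SELECT {column_list} FROM {quoted_table} ORDER BY RANDOM() LIMIT ?;"
    return connection.execute(query, (n,)).fetchall()


# Demo on an in-memory table with illustrative values (not the real data).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Indicators (Year INTEGER, Value NUMERIC);")
demo.executemany(
    "INSERT INTO Indicators VALUES (?, ?);",
    [(1960 + offset, offset) for offset in range(56)],
)
demo_sample = sample_rows(demo, "Indicators", ["Year", "Value"], n=10)
demo.close()

print(len(demo_sample))  # prints 10
```

The returned list of tuples can be passed to `pd.DataFrame(rows, columns=...)` if a DataFrame is preferred for display.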
