# Duplicate Data

A data set might have duplicate data: in other words, the same record is represented multiple times. Sometimes, it's easy to find and eliminate duplicate data like when two records are exactly the same. At other times, like what was discussed in the video, duplicate data is hard to spot. 

# Exercise 1

From the World Bank GDP data, count the number of countries that have had a project totalamt greater than 1 billion dollars (1,000,000,000). To get the count, you'll have to remove duplicate data rows.

In [8]:
import pandas as pd

# read in the projects data set and do some basic wrangling 
projects = pd.read_csv('../data/projects_data.csv', dtype=str)
projects.drop('Unnamed: 56', axis=1, inplace=True)
projects['totalamt'] = pd.to_numeric(projects['totalamt'].str.replace(',', ''))
projects['countryname'] = projects['countryname'].str.split(';', expand=True)[0]
projects['boardapprovaldate'] = pd.to_datetime(projects['boardapprovaldate'])

# TODO: filter the data frame for projects over 1 billion dollars
# TODO: count the number of unique countries in the results
projects[projects["totalamt"] > 1e9]["countryname"].nunique() # .unique().shape[0]

17

# Exercise 2 (challenge)

This exercise is more challenging. The projects data set contains data about Yugoslavia, which was an Eastern European country until 1992. Yugoslavia eventually broke up into 7 countries: Bosnia and Herzegovina, Croatia, Kosovo, Macedonia, Montenegro, Serbia, and Slovenia.

But the projects dataset has some ambiguity in how it treats Yugoslavia and the 7 countries that came from Yugoslavia. Your task is to find Yugoslavia projects that are probably represented multiple times in the data set.

In [11]:
# TODO: output all projects for the 'Socialist Federal Republic of Yugoslavia'
# HINT: You can use the exact country name or use the pandas str.contains() method to search for Yugoslavia
projects[projects["countryname"].str.contains('Socialist Federal Republic of Yugoslavia')]

Unnamed: 0,id,regionname,countryname,prodline,lendinginstr,lendinginstrtype,envassesmentcategorycode,supplementprojectflg,productlinetype,projectstatusdisplay,status,project_name,boardapprovaldate,board_approval_month,closingdate,lendprojectcost,ibrdcommamt,idacommamt,totalamt,grantamt,borrower,impagency,url,projectdoc,majorsector_percent,sector1,sector2,sector3,sector4,sector5,sector,mjsector1,mjsector2,mjsector3,mjsector4,mjsector5,mjsector,theme1,theme2,theme3,theme4,theme5,theme,goal,financier,mjtheme1name,mjtheme2name,mjtheme3name,mjtheme4name,mjtheme5name,location,GeoLocID,GeoLocName,Latitude,Longitude,Country
11166,P009285,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,A,N,L,Closed,Closed,Kolubara B Thermal Power & Lignite Mine Project,1991-06-25 00:00:00+00:00,June,1997-06-30T00:00:00Z,300000000,300000000,0,300000000,0,EPS ROA (ZEP),EPS,http://projects.worldbank.org/P009285/kolubara...,,,Power!$!44!$!LD,Renewable Energy Biomass!$!56!$!LB,,,,Power;Power;Renewable Energy Biomass,,,,,,Energy and Extractives;Energy and Extractives;...,Rural services and infrastructure!$!33!$!78,Infrastructure services for private sector dev...,Urban services and housing for the poor!$!33!$!71,,,,Corporate Advocacy Priorities;Corporate Advoca...,,,,,,,,,,,,
11410,P009231,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Investment and Maintenance Loan,IN,B,N,L,Closed,Closed,Highway Sector Loan Project (03),1990-06-20 00:00:00+00:00,June,1994-12-31T00:00:00Z,292000000,292000000,0,292000000,0,ALL SIX ROAD ORGANIZATIONS,FARP,http://projects.worldbank.org/P009231/highway-...,,,Roads and highways!$!100!$!TA,,,,,Roads and highways;Roads and highways,,,,,,Transportation;Transportation,Export development and competitiveness!$!25!$!45,Pollution management and environmental health!...,Infrastructure services for private sector dev...,,,,Global Public Goods Priorities|Corporate Advoc...,,,,,,,,,,,,
11479,P009219,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Structural Adjustment Loan,AD,C,N,L,Closed,Closed,Structural Adjustment Loan Project (02),1990-04-12 00:00:00+00:00,April,1991-09-30T00:00:00Z,400000000,400000000,0,400000000,0,NBY,NBY,http://projects.worldbank.org/P009219/structur...,,,Other Industry; Trade and Services!$!38!$!YZ,Banking Institutions!$!36!$!FA,Central Government (Central Agencies)!$!12!$!BC,Other industry!$!7!$!YW,Other social services!$!7!$!JB,Other Industry; Trade and Services;Other Indus...,,,,,,Industry; Trade and Services;Industry; Trade a...,Macroeconomic management!$!33!$!23,International financial standards and systems!...,Other public sector governance!$!17!$!30,Trade facilitation and market access!$!16!$!49,Other economic management!$!17!$!24,,Corporate Advocacy Priorities;Corporate Advoca...,,,,,,,,,,,,
11694,P009225,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Investment and Maintenance Loan,IN,C,N,L,Closed,Closed,Railway Project (07),1989-05-23 00:00:00+00:00,May,1992-12-31T00:00:00Z,138000000,138000000,0,138000000,0,RTO BELGRADE; RE LJUBLJANA; RTO NOVI SAD,BORROWER,http://projects.worldbank.org/P009225/railway-...,,,Railways!$!100!$!TW,,,,,Railways;Railways,,,,,,Transportation;Transportation,!$!0,,,,,,,,,,,,,,,,,,
11695,P009275,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,,N,L,Closed,Closed,Istria Water Supply & Sewerage Project,1989-05-23 00:00:00+00:00,May,1997-12-31T00:00:00Z,60000000,60000000,0,60000000,0,RIZANA WW; ISTRIAN WW & PULA WW,BUTONIGA WW AND RIZANA WW,http://projects.worldbank.org/P009275/istria-w...,,,(Historic)Urban water supply!$!100!$!WU,,,,,(Historic)Urban water supply;(Historic)Urban w...,,,,,,Water; Sanitation and Waste Management;Water; ...,!$!0,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17903,P009137,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,,N,L,Closed,Closed,Electric Power Project,1962-07-11 00:00:00+00:00,July,1968-12-31T00:00:00Z,30000000,30000000,0,30000000,0,,,http://projects.worldbank.org/P009137/electric...,,,(Historic)Hydro!$!100!$!PH,,,,,(Historic)Hydro;(Historic)Hydro,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
17963,P009136,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,,N,L,Closed,Closed,Electric Power Project,1961-02-23 00:00:00+00:00,February,1968-01-31T00:00:00Z,30000000,30000000,0,30000000,0,,,http://projects.worldbank.org/P009136/electric...,,,(Historic)Hydro!$!100!$!PH,,,,,(Historic)Hydro;(Historic)Hydro,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
18175,P009135,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Structural Adjustment Loan,AD,,N,L,Closed,Closed,Power Mining Industry Project,1953-02-11 00:00:00+00:00,February,1957-12-31T00:00:00Z,30000000,30000000,0,30000000,0,,,http://projects.worldbank.org/P009135/power-mi...,,,(Historic)Economic management!$!100!$!ME,,,,,(Historic)Economic management;(Historic)Econom...,,,,,,(Historic)Multisector;(Historic)Multisector,!$!0,,,,,,,,,,,,,,,,,,
18197,P009134,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Structural Adjustment Loan,AD,,N,L,Closed,Closed,Power Mining Industry Project,1951-10-11 00:00:00+00:00,October,1957-12-31T00:00:00Z,28000000,28000000,0,28000000,0,,,http://projects.worldbank.org/P009134/power-mi...,,,(Historic)Economic management!$!100!$!ME,,,,,(Historic)Economic management;(Historic)Econom...,,,,,,(Historic)Multisector;(Historic)Multisector,!$!0,,,,,,,,,,,,,,,,,,


Yugoslavia officially ended on [April 27th, 1992](https://en.wikipedia.org/wiki/Yugoslavia). 

In the code cell below, filter for projects with a 'boardapprovaldate' prior to April 27th, 1992 **and** with 'countryname' Bosnia and Herzegovina, Croatia, Kosovo, Macedonia, Serbia **or** Slovenia. You'll see there are a total of 12 projects in the data set that match this criteria. Save the results in the republics variable

In [33]:
import datetime
import pytz

utc=pytz.UTC

# all 3 options work
date = utc.localize(pd.to_datetime('1992/4/27'))
# date = datetime.datetime(1992, 4, 27, tzinfo=utc)

# date = "04/27/1992"

# TODO: filter the projects data set for project boardapprovaldate prior to April 27th, 1992 AND with countryname
#  of either 'Bosnia and Herzegovina', 'Croatia', 'Kosovo', 'Macedonia', 'Serbia', or 'Sovenia'. Store the
#  results in the republics variable
#
#  TODO: so that it's easier to see all the data, keep only these columns:
# ['regionname', 'countryname', 'lendinginstr', 'totalamt', 'boardapprovaldate',
# 'location','GeoLocID', 'GeoLocName', 'Latitude','Longitude','Country', 'project_name']

# TODO: sort the results by boardapprovaldate

countrylist = ['Bosnia and Herzegovina', 'Croatia', 'Kosovo', 'Macedonia', 'Serbia', 'Sovenia']

republics = projects[
        (projects["boardapprovaldate"] < date)
        &
        (projects["countryname"].apply(lambda c: c in countrylist))
         ]

# show the results
republics

Unnamed: 0,id,regionname,countryname,prodline,lendinginstr,lendinginstrtype,envassesmentcategorycode,supplementprojectflg,productlinetype,projectstatusdisplay,status,project_name,boardapprovaldate,board_approval_month,closingdate,lendprojectcost,ibrdcommamt,idacommamt,totalamt,grantamt,borrower,impagency,url,projectdoc,majorsector_percent,sector1,sector2,sector3,sector4,sector5,sector,mjsector1,mjsector2,mjsector3,mjsector4,mjsector5,mjsector,theme1,theme2,theme3,theme4,theme5,theme,goal,financier,mjtheme1name,mjtheme2name,mjtheme3name,mjtheme4name,mjtheme5name,location,GeoLocID,GeoLocName,Latitude,Longitude,Country
12063,P039261,Europe and Central Asia,Bosnia and Herzegovina,PE,Sector Investment and Maintenance Loan,IN,C,Y,L,Closed,Closed,HIGHWAY SECTOR II,1987-10-13 00:00:00+00:00,October,,0,0,0,0,0,ROAD ORG. OF KOS;MAC;MON;VODJ;CRO .,ABOVE ORGANIZATION ...,http://projects.worldbank.org/P039261/highway-...,,,(Historic)Highways!$!100!$!TH,,,,,(Historic)Highways;(Historic)Highways,,,,,,Transportation;Transportation,!$!0,,,,,,,,,,,,,,,,,,
13048,P038998,Europe and Central Asia,Bosnia and Herzegovina,PE,Specific Investment Loan,IN,C,Y,L,Closed,Closed,POWER TRANS.III,1983-07-26 00:00:00+00:00,July,1993-12-31T00:00:00Z,0,0,0,0,0,,,http://projects.worldbank.org/P038998/power-tr...,,,(Historic)Distribution and transmission!$!100!...,,,,,(Historic)Distribution and transmission;(Histo...,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
13050,P039000,Europe and Central Asia,Macedonia,PE,Specific Investment Loan,IN,C,Y,L,Closed,Closed,POWER TRANS.III,1983-07-26 00:00:00+00:00,July,1993-12-31T00:00:00Z,0,0,0,0,0,,,http://projects.worldbank.org/P039000/power-tr...,,,(Historic)Distribution and transmission!$!100!...,,,,,(Historic)Distribution and transmission;(Histo...,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
13973,P009174,Europe and Central Asia,Macedonia,PE,Specific Investment Loan,IN,C,N,L,Closed,Closed,Agriculture & Agroindustry 2 Project (Macedonia),1980-02-01 00:00:00+00:00,February,1983-06-30T00:00:00Z,24000000,24000000,0,24000000,0,,,http://projects.worldbank.org/P009174/agricult...,,,(Historic)Agricultural credit!$!100!$!AC,,,,,(Historic)Agricultural credit;(Historic)Agricu...,,,,,,Agriculture; Fishing and Forestry;Agriculture;...,!$!0,,,,,,,IBRD13710;IBRD13710,,,,,,,,,,,


Are these projects also represented in the data labeled Yugoslavia? In the code cell below, filter for Yugoslavia projects approved between February 1st, 1980 and May 23rd, 1989 which are the minimum and maximum dates in the results above. Store the results in the yugoslavia variable.

The goal is to see if there are any projects represented more than once in the data set.

In [39]:
projects[projects["countryname"] == "Yugoslavia"]

Unnamed: 0,id,regionname,countryname,prodline,lendinginstr,lendinginstrtype,envassesmentcategorycode,supplementprojectflg,productlinetype,projectstatusdisplay,status,project_name,boardapprovaldate,board_approval_month,closingdate,lendprojectcost,ibrdcommamt,idacommamt,totalamt,grantamt,borrower,impagency,url,projectdoc,majorsector_percent,sector1,sector2,sector3,sector4,sector5,sector,mjsector1,mjsector2,mjsector3,mjsector4,mjsector5,mjsector,theme1,theme2,theme3,theme4,theme5,theme,goal,financier,mjtheme1name,mjtheme2name,mjtheme3name,mjtheme4name,mjtheme5name,location,GeoLocID,GeoLocName,Latitude,Longitude,Country


In [56]:
# TODO: Filter the projects data for Yugoslavia projects between
# February 1st, 1980 and May 23rd, 1989. Store the results in the
# Yugoslavia variable. Keep the same columns as the previous code cell.
# Sort the values by boardapprovaldate

date_start = utc.localize(pd.to_datetime('1980/2/1'))
date_end = utc.localize(pd.to_datetime('1989/5/23'))

yugoslavia = projects[
    (projects["countryname"].str.contains("Yugoslavia"))
    &
    (projects["boardapprovaldate"] >= date_start)
    &
    (projects["boardapprovaldate"] <= date_end)
]

# show the results
yugoslavia

Unnamed: 0,id,regionname,countryname,prodline,lendinginstr,lendinginstrtype,envassesmentcategorycode,supplementprojectflg,productlinetype,projectstatusdisplay,status,project_name,boardapprovaldate,board_approval_month,closingdate,lendprojectcost,ibrdcommamt,idacommamt,totalamt,grantamt,borrower,impagency,url,projectdoc,majorsector_percent,sector1,sector2,sector3,sector4,sector5,sector,mjsector1,mjsector2,mjsector3,mjsector4,mjsector5,mjsector,theme1,theme2,theme3,theme4,theme5,theme,goal,financier,mjtheme1name,mjtheme2name,mjtheme3name,mjtheme4name,mjtheme5name,location,GeoLocID,GeoLocName,Latitude,Longitude,Country
11694,P009225,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Investment and Maintenance Loan,IN,C,N,L,Closed,Closed,Railway Project (07),1989-05-23 00:00:00+00:00,May,1992-12-31T00:00:00Z,138000000,138000000,0,138000000,0,RTO BELGRADE; RE LJUBLJANA; RTO NOVI SAD,BORROWER,http://projects.worldbank.org/P009225/railway-...,,,Railways!$!100!$!TW,,,,,Railways;Railways,,,,,,Transportation;Transportation,!$!0,,,,,,,,,,,,,,,,,,
11695,P009275,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,,N,L,Closed,Closed,Istria Water Supply & Sewerage Project,1989-05-23 00:00:00+00:00,May,1997-12-31T00:00:00Z,60000000,60000000,0,60000000,0,RIZANA WW; ISTRIAN WW & PULA WW,BUTONIGA WW AND RIZANA WW,http://projects.worldbank.org/P009275/istria-w...,,,(Historic)Urban water supply!$!100!$!WU,,,,,(Historic)Urban water supply;(Historic)Urban w...,,,,,,Water; Sanitation and Waste Management;Water; ...,!$!0,,,,,,,,,,,,,,,,,,
11866,P009242,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Financial Intermediary Loan,IN,C,N,L,Closed,Closed,Export Oriented Industries Project,1988-06-29 00:00:00+00:00,June,1994-06-30T00:00:00Z,120000000,120000000,0,120000000,0,LOCAL COMMERCIAL BANKS,BORROWERS,http://projects.worldbank.org/P009242/export-o...,,,(Historic)Other finance!$!100!$!FY,,,,,(Historic)Other finance;(Historic)Other finance,,,,,,Financial Sector;Financial Sector,!$!0,,,,,,,,,,,,,,,,,,
12060,P009265,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Investment and Maintenance Loan,IN,C,N,L,Closed,Closed,Highway Sector Project (02),1987-10-13 00:00:00+00:00,October,1992-12-31T00:00:00Z,68000000,68000000,0,68000000,0,ROAD ORG. OF KOS;MAC;MON;VODJ;CRO .,ABOVE ORGANIZATION,http://projects.worldbank.org/P009265/highway-...,,,(Historic)Highways!$!100!$!TH,,,,,(Historic)Highways;(Historic)Highways,,,,,,Transportation;Transportation,!$!0,,,,,,,,,,,,,,,,,,
12225,P009217,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Financial Intermediary Loan,IN,C,N,L,Closed,Closed,Energy Conservation & Substitution Project,1987-03-31 00:00:00+00:00,March,1994-06-30T00:00:00Z,90000000,90000000,0,90000000,0,LJUBLJANSKA BANKA LJUBLJANA (LBL),BORROWERS,http://projects.worldbank.org/P009217/energy-c...,,,(Historic)Other finance!$!100!$!FY,,,,,(Historic)Other finance;(Historic)Other finance,,,,,,Financial Sector;Financial Sector,!$!0,,,,,,,,,,,,,,,,,,
12227,P038996,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Financial Intermediary Loan,IN,C,Y,L,Closed,Closed,IND.ENERGY EFFIC. I,1987-03-31 00:00:00+00:00,March,,0,0,0,0,0,LJUBLJANSKA BANKA LJUBLJANA (LBL),BORROWERS,http://projects.worldbank.org/P038996/indenerg...,,,(Historic)Other finance!$!100!$!FY,,,,,(Historic)Other finance;(Historic)Other finance,,,,,,Financial Sector;Financial Sector,!$!0,,,,,,,,,,,,,,,,,,
12375,P009215,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Investment and Maintenance Loan,IN,C,N,L,Closed,Closed,Highway Sector Project,1986-06-10 00:00:00+00:00,June,1991-12-31T00:00:00Z,121500000,121500000,0,121500000,0,ROADS ORGANIZATIONS & PARTICIPATING REP.,ROADS ORGANIZATIONS & PARTICIPATING REPUBLICS.,http://projects.worldbank.org/P009215/highway-...,,,(Historic)Highways!$!100!$!TH,,,,,(Historic)Highways;(Historic)Highways,,,,,,Transportation;Transportation,!$!0,,,,,,,,,,,,,,,,,,
12577,P009212,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,C,N,L,Closed,Closed,Petroleum Sector Project,1985-06-27 00:00:00+00:00,June,1992-12-31T00:00:00Z,92500000,92500000,0,92500000,0,INA NAFTAPLIN; NAFTA-GAS & P.B.S.,NAFTAPLIN; NAFTA-GAS & EXPL,http://projects.worldbank.org/P009212/petroleu...,,,(Historic)Oil and gas exploration and developm...,,,,,(Historic)Oil and gas exploration and developm...,,,,,,(Historic)Oil & Gas;(Historic)Oil & Gas,!$!0,,,,,,,,,,,,,,,,,,
12617,P009211,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,,N,L,Closed,Closed,Bosnia Herzegovina Forestry Project,1985-06-06 00:00:00+00:00,June,1990-12-31T00:00:00Z,35000000,35000000,0,35000000,0,PRIVREDNA BANKA SARAJEVO,SELECTED FORESTRY; BOALS ASSOCIATED IN SIPAD &...,http://projects.worldbank.org/P009211/bosnia-h...,,,Forestry!$!100!$!AT,,,,,Forestry;Forestry,,,,,,Agriculture; Fishing and Forestry;Agriculture;...,!$!0,,,,,,,,,,,,,,,,,,
12671,P009205,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Sector Adjustment Loan,AD,,N,L,Closed,Closed,Fertilizer Sector Loan Project,1985-05-03 00:00:00+00:00,May,1988-07-31T00:00:00Z,90000000,90000000,0,90000000,0,,,http://projects.worldbank.org/P009205/fertiliz...,,,(Historic)Agriculture adjustment!$!100!$!AA,,,,,(Historic)Agriculture adjustment;(Historic)Agr...,,,,,,Agriculture; Fishing and Forestry;Agriculture;...,!$!0,,,,,,,,,,,,,,,,,,


And as a final step, try to see if there are any projects in the republics variable and yugoslavia variable that could be the same project.

There are multiple ways to do that. As a suggestion, find unique dates in the republics variable. Then separately find unique dates in the yugoslavia variable. Concatenate (ie append) the results together. And then count the number of times each date occurs in this list. If a date occurs twice, that means the same boardapprovaldate appeared in both the Yugoslavia data as well as in the republics data.

You'll should find that there are three suspicious cases:

* July 26th, 1983
* March 31st, 1987
* October 13th, 1987
* May 23rd, 1989

In [75]:
import numpy as np

# TODO: find the unique dates in the republics variable
republic_unique_dates = republics["boardapprovaldate"].unique()

# TODO: find the unique dates in the yugoslavia variable
yugoslavia_unique_dates = yugoslavia["boardapprovaldate"].unique()

# TODO: make a list of the results appending one list to the other
dates = [*republic_unique_dates, *yugoslavia_unique_dates]

# TODO: print out the dates that appeared twice in the results
from collections import Counter
counter = Counter(dates)

duplicates = [t for t, c in counter.items() if c > 1]
duplicates

# *OR*
unique_dates, count = np.unique(dates, return_counts=True)

duplicates = [unique_dates[i] for i, c in enumerate(count) if c > 1]
duplicates

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1]


[Timestamp('1983-07-26 00:00:00+0000', tz='UTC'),
 Timestamp('1987-10-13 00:00:00+0000', tz='UTC')]

# Conclusion

On July 26th, 1983, for example, projects were approved for Bosnia and Herzegovina, Croatia, Macedonia, Slovenia, and Yugoslavia. The code below shows the projects for that date. You'll notice that Yugoslavia had two projects, one of which was called "Power Transmission Project (03) Energy Managem...". The projects in the other countries were all called "POWER TRANS.III". 

This looks like a case of duplicate data. What you end up doing with this knowledge would depend on the context. For example, if you wanted to get a true count for the total number of projects in the data set, should all of these projects be counted as one project? 

Run the code cell below to see the projects in question.

In [78]:
import datetime

# run this code cell to see the duplicate data
pd.concat([yugoslavia[yugoslavia['boardapprovaldate'] == datetime.datetime(1983, 7, 26, tzinfo=utc)], republics[republics['boardapprovaldate'] == datetime.datetime(1983, 7, 26, tzinfo=utc)]])

Unnamed: 0,id,regionname,countryname,prodline,lendinginstr,lendinginstrtype,envassesmentcategorycode,supplementprojectflg,productlinetype,projectstatusdisplay,status,project_name,boardapprovaldate,board_approval_month,closingdate,lendprojectcost,ibrdcommamt,idacommamt,totalamt,grantamt,borrower,impagency,url,projectdoc,majorsector_percent,sector1,sector2,sector3,sector4,sector5,sector,mjsector1,mjsector2,mjsector3,mjsector4,mjsector5,mjsector,theme1,theme2,theme3,theme4,theme5,theme,goal,financier,mjtheme1name,mjtheme2name,mjtheme3name,mjtheme4name,mjtheme5name,location,GeoLocID,GeoLocName,Latitude,Longitude,Country
13046,P009206,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Financial Intermediary Loan,IN,,N,L,Closed,Closed,Industrial Credit Project (07),1983-07-26 00:00:00+00:00,July,1989-12-31T00:00:00Z,70000000,70000000,0,70000000,0,,,http://projects.worldbank.org/P009206/industri...,,,(Historic)Financial sector development!$!100!$!FS,,,,,(Historic)Financial sector development;(Histor...,,,,,,Financial Sector;Financial Sector,!$!0,,,,,,,,,,,,,,,,,,
13047,P009208,Europe and Central Asia,Socialist Federal Republic of Yugoslavia,PE,Specific Investment Loan,IN,C,N,L,Closed,Closed,Power Transmission Project (03) Energy Managem...,1983-07-26 00:00:00+00:00,July,1993-12-31T00:00:00Z,120000000,120000000,0,120000000,0,,,http://projects.worldbank.org/P009208/power-tr...,,,(Historic)Distribution and transmission!$!100!...,,,,,(Historic)Distribution and transmission;(Histo...,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
13048,P038998,Europe and Central Asia,Bosnia and Herzegovina,PE,Specific Investment Loan,IN,C,Y,L,Closed,Closed,POWER TRANS.III,1983-07-26 00:00:00+00:00,July,1993-12-31T00:00:00Z,0,0,0,0,0,,,http://projects.worldbank.org/P038998/power-tr...,,,(Historic)Distribution and transmission!$!100!...,,,,,(Historic)Distribution and transmission;(Histo...,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
13050,P039000,Europe and Central Asia,Macedonia,PE,Specific Investment Loan,IN,C,Y,L,Closed,Closed,POWER TRANS.III,1983-07-26 00:00:00+00:00,July,1993-12-31T00:00:00Z,0,0,0,0,0,,,http://projects.worldbank.org/P039000/power-tr...,,,(Historic)Distribution and transmission!$!100!...,,,,,(Historic)Distribution and transmission;(Histo...,,,,,,(Historic)Electric Power & Other Energy;(Histo...,!$!0,,,,,,,,,,,,,,,,,,
