# Reading in a text file

**Text files contain plain text, JPEGs contain images and MP3s contain audio. These are all examples of files that Python can read.**

**To start, open the text file with Python's built-in `open()` function; this requires understanding file paths and how to locate files on your computer. To get at the data itself, iterate over the file line by line.**

In [1]:
jabber = open('data/jabberwocky.txt', 'r')

for line in jabber:
    print(line)
    
    
jabber.close()

'Twas brillig, and the slithy toves

Did gyre and gimble in the wabe;

All mimsy were the borogoves,

And the mome raths outgrabe.



"Beware the Jabberwock, my son!

The jaws that bite, the claws that catch!

Beware the Jubjub bird, and shun

The frumious Bandersnatch!"



He took his vorpal sword in hand:

Long time the manxome foe he soughtâ€”

So rested he by the Tumtum tree,

And stood awhile in thought.



And as in uffish thought he stood,

The Jabberwock, with eyes of flame,

Came whiffling through the tulgey wood,

And burbled as it came!



One two! One two! And through and through

The vorpal blade went snicker-snack!

He left it dead, and with its head

He went galumphing back.



"And hast thou slain the Jabberwock?

Come to my arms, my beamish boy!"

"O frabjous day! Callooh! Callay!"

He chortled in his joy.



'Twas brillig, and the slithy toves

Did gyre and gimble in the wabe;

All mimsy were the borogoves,

And the mome raths outgrabe.



    â€“ Lewis Carroll



In [4]:
jabber = open('data/jabberwocky.txt', encoding='utf-8')

for line in jabber:
    print(line, end='')
    
    
jabber.close()

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One two! One two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!"
"O frabjous day! Callooh! Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

    – Lewis Carroll


**The first output is double-spaced because each line read from the file already ends with a newline character (`\n`), and `print()` adds a second one. You can avoid this by passing `end=''` to `print()` in the loop, which suppresses the extra newline. You can also use Python string methods on each line to clean up the data:**

    print(line.strip())
    print(line.lstrip())
    print(line.rstrip())

**Also note that the dashes in the poem were not decoded properly until the `encoding` argument was specified.**
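**The garbled dashes in the first output are what UTF-8 bytes look like when they are decoded with a single-byte code page. The following is a minimal sketch, assuming the platform default was `cp1252`; the notebook does not say which encoding was actually used.**

```python
# Assumption: the default encoding on the machine that produced the
# first output was cp1252 (a common Windows default).
em_dash = "\u2014"  # the long dash that ends the line "...foe he sought"

utf8_bytes = em_dash.encode("utf-8")    # three bytes: e2 80 94
garbled = utf8_bytes.decode("cp1252")   # each byte is read as its own character
restored = utf8_bytes.decode("utf-8")   # the three bytes are read as one character

print(utf8_bytes)
print(len(garbled), len(restored))
```

**One character becomes three when the wrong encoding is used, which is why passing `encoding='utf-8'` to `open()` repairs the output.**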

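**NOTE: Remember to close the file once you are done with it. An open file holds an operating-system resource, which on Windows can stop other programs from renaming or deleting it. When a file is open for writing, data still sitting in the buffer may never reach the disk if the file is not closed.**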

## `with` statement

**The code above is not the cleanest way to read files. The `with` statement, introduced in Python 2.5, closes the file automatically when the block ends, even if an error occurs inside it.**

In [7]:
with open('data/jabberwocky.txt', encoding='utf-8') as jabber:
    for line in jabber:
        print(line.rstrip())

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One two! One two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!"
"O frabjous day! Callooh! Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

    – Lewis Carroll
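
**To check that a `with` block really closes the file, you can inspect the file object's `closed` attribute. This is a small sketch that writes its own temporary file, so it does not depend on `data/jabberwocky.txt`; the variable names are only illustrative.**

```python
import os
import tempfile

# Create a throwaway text file containing the first line of the poem
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("'Twas brillig, and the slithy toves\n")
    path = tmp.name

with open(path, encoding='utf-8') as poem:
    first = poem.readline().rstrip()
    open_inside = not poem.closed   # the file is still open inside the block

closed_after = poem.closed          # closed as soon as the block ends

print(first)
print(open_inside, closed_after)

os.remove(path)  # tidy up the temporary file
```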


In [26]:
# Read the first 4 lines into a list

with open('data/jabberwocky.txt', encoding='utf-8') as jabber:
    head = [next(jabber) for _ in range(4)]

    
print(*head)

'Twas brillig, and the slithy toves
 Did gyre and gimble in the wabe;
 All mimsy were the borogoves,
 And the mome raths outgrabe.



**The `next()` function retrieves the next line from the file object so that it can be added to the list. Because `range(4)` yields four values, the comprehension stops after the fourth line.**
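
**Calling `next()` on a file with fewer lines than requested raises `StopIteration`. One possible alternative is `itertools.islice`, sketched below with `io.StringIO` standing in for an open text file.**

```python
import io
from itertools import islice

# io.StringIO behaves like an open text file, so no real file is needed
short_file = io.StringIO("line one\nline two\n")

# islice stops quietly when the file runs out, even though 4 lines were requested
head = list(islice(short_file, 4))
print(head)

# next() accepts a default, returned instead of raising StopIteration
leftover = next(short_file, None)
print(leftover)
```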

**Although iterating over an open text file is the most common way to read text, there are other methods: `read()`, `readline()` and `readlines()`.**

**The `readlines()` method returns every line of the file as a list of strings.**

In [20]:
with open('data/jabberwocky.txt', encoding='utf-8') as jabber:
    lines = jabber.readlines()

    
print(*lines)
print(lines)
print()
print(lines[-3])

'Twas brillig, and the slithy toves
 Did gyre and gimble in the wabe;
 All mimsy were the borogoves,
 And the mome raths outgrabe.
 
 "Beware the Jabberwock, my son!
 The jaws that bite, the claws that catch!
 Beware the Jubjub bird, and shun
 The frumious Bandersnatch!"
 
 He took his vorpal sword in hand:
 Long time the manxome foe he sought—
 So rested he by the Tumtum tree,
 And stood awhile in thought.
 
 And as in uffish thought he stood,
 The Jabberwock, with eyes of flame,
 Came whiffling through the tulgey wood,
 And burbled as it came!
 
 One two! One two! And through and through
 The vorpal blade went snicker-snack!
 He left it dead, and with its head
 He went galumphing back.
 
 "And hast thou slain the Jabberwock?
 Come to my arms, my beamish boy!"
 "O frabjous day! Callooh! Callay!"
 He chortled in his joy.
 
 'Twas brillig, and the slithy toves
 Did gyre and gimble in the wabe;
 All mimsy were the borogoves,
 And the mome raths outgrabe.
 
     – Lewis Carroll

["'Twas b

**The `read()` method returns the entire text as one single string rather than split up into lines, even though the output looks similar.**

In [21]:
with open('data/jabberwocky.txt', encoding='utf-8') as jabber:
    text = jabber.read()
    
print(text)

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One two! One two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!"
"O frabjous day! Callooh! Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

    – Lewis Carroll
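
**The difference between the three methods is easier to see on a small in-memory file than on the whole poem. In this sketch `io.StringIO` stands in for the open file and the sample text is made up for illustration.**

```python
import io

sample = "first line\nsecond line\nthird line\n"

whole = io.StringIO(sample).read()        # one string containing everything
lines = io.StringIO(sample).readlines()   # a list of strings, newlines kept
one = io.StringIO(sample).readline()      # only the first line, newline kept

print(type(whole).__name__)
print(type(lines).__name__, len(lines))
print(repr(one))
```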



In [22]:
# Print the string backwards

for character in reversed(text):
    print(character, end='')


llorraC siweL –    

.ebargtuo shtar emom eht dnA
,sevogorob eht erew ysmim llA
;ebaw eht ni elbmig dna eryg diD
sevot yhtils eht dna ,gillirb sawT'

.yoj sih ni deltrohc eH
"!yallaC !hoollaC !yad suojbarf O"
"!yob hsimaeb ym ,smra ym ot emoC
?kcowrebbaJ eht nials uoht tsah dnA"

.kcab gnihpmulag tnew eH
daeh sti htiw dna ,daed ti tfel eH
!kcans-rekcins tnew edalb laprov ehT
hguorht dna hguorht dnA !owt enO !owt enO

!emac ti sa delbrub dnA
,doow yeglut eht hguorht gnilffihw emaC
,emalf fo seye htiw ,kcowrebbaJ ehT
,doots eh thguoht hsiffu ni sa dnA

.thguoht ni elihwa doots dnA
,eert mutmuT eht yb eh detser oS
—thguos eh eof emoxnam eht emit gnoL
:dnah ni drows laprov sih koot eH

"!hctansrednaB suoimurf ehT
nuhs dna ,drib bujbuJ eht eraweB
!hctac taht swalc eht ,etib taht swaj ehT
!nos ym ,kcowrebbaJ eht eraweB"

.ebargtuo shtar emom eht dnA
,sevogorob eht erew ysmim llA
;ebaw eht ni elbmig dna eryg diD
sevot yhtils eht dna ,gillirb sawT'

**The `readline()` method is used less often than the others, but it lets you handle one line at a time, which is useful for large text files. It is similar to iterating over the file in the `for` loop seen at the start.**

In [23]:
with open('data/jabberwocky.txt', encoding='utf-8') as jabber:
    while True:
        raw = jabber.readline()
        if not raw:  # readline() returns '' at end of file
            break
        line = raw.rstrip()
        print(line)

        if 'bandersnatch' in line.casefold():
            break

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"


In [24]:
print(line)

The frumious Bandersnatch!"


## String methods

**You have seen the `strip()` and `rstrip()` methods used to process text, but their behaviour is not obvious without some explanation, especially alongside the related `lstrip()` method.**

**In Python, whitespace characters include the space, tab, newline, carriage return and form feed. Calling the `strip()` method on a string removes any leading and trailing whitespace characters.**

**The `lstrip()` method removes any whitespace characters from the left side of the string, e.g. tab spaces.**

**The `rstrip()` method removes any whitespace characters from the right side of the string, e.g. newline characters.**

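**'Left' and 'right' here mean the start and the end of the string as it is stored, not how it appears on screen. For text in a right-to-left script, `lstrip()` still removes characters from the beginning of the string.**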
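
**A short sketch of the three methods on a string with whitespace at both ends. The sample string is made up, and `repr()` is used so that the remaining tabs and newlines are visible.**

```python
padded = "\t  slithy toves \n"

print(repr(padded.strip()))   # whitespace removed from both ends
print(repr(padded.lstrip()))  # only the leading tab and spaces removed
print(repr(padded.rstrip()))  # only the trailing space and newline removed
```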

In [1]:
filepath = 'data/jabberwocky.txt'

with open(filepath) as poem:
    first = poem.readline().rstrip()
    

print(first)

'Twas brillig, and the slithy toves


In [2]:
filepath = 'data/jabberwocky.txt'

with open(filepath) as poem:
    first = poem.readline().lstrip()
    

print(first)

'Twas brillig, and the slithy toves



In [3]:
filepath = 'data/jabberwocky.txt'

with open(filepath) as poem:
    first = poem.readline().strip()
    

print(first)

'Twas brillig, and the slithy toves


**The stripping methods take an optional argument listing the characters to remove; without it they remove whitespace. The argument is treated as a set of characters, not as a prefix or suffix. Characters are removed from the relevant end of the string one by one, and stripping stops at the first character that is not in the set.**

**As you can see, all three methods print the same text for the first line, so there appears to be no difference in behaviour. The initial apostrophe is not whitespace, so you need to explicitly pass it as a character to remove. Use `lstrip()` here, because `strip()` would remove apostrophes from both ends of the line.**

In [5]:
chars = "' "

no_apos_first = first.lstrip(chars)

print(no_apos_first)

Twas brillig, and the slithy toves
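
**The following sketch illustrates that the argument is a set of characters, and why `lstrip()` and `strip()` can give different results. Both sample strings are made up for illustration.**

```python
url = "www.example.com"

# Every leading or trailing character found in the set is removed,
# in any order, until a character outside the set is reached
print(url.strip("cmowz."))    # example

quoted = "'Twas brillig'"
print(quoted.lstrip("' "))    # Twas brillig'  (trailing apostrophe kept)
print(quoted.strip("' "))     # Twas brillig   (both apostrophes removed)
```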


## Parsing data in a text file

**In many cases, the data in a text file is formatted in such a way that you need to 'parse' it, e.g. when each line is a row of a table and the fields are separated by a comma or a `|` character. You can read the data into nested dictionaries, lists or sets.**

**In the sample provided, each country has seven fields: the country name, capital, two- and three-letter country codes, phone code, timezone and currency. Not every country has data for every field, so there are missing values, e.g. Antarctica has no capital city, and polar regions cover all timezones. Parsing the data means making sense of it by separating it into logical components that your program can work with.**

    Country|Capital|CC|CC3|IAC|TimeZone|Currency
    Afghanistan|Kabul|AF|AFG|+93|UTC+04:30|Afghan afghani
    ...

In [1]:
input_file = 'data/country_info.txt'

In [7]:
# Note that list for Antarctica has seven strings, even if most are empty

with open(input_file) as countries:
    for row in countries:
        data = row.strip().split('|')
        print(data)

['Country', 'Capital', 'CC', 'CC3', 'IAC', 'TimeZone', 'Currency']
['Afghanistan', 'Kabul', 'AF', 'AFG', '+93', 'UTC+04:30', 'Afghan afghani']
['Aland Islands', 'Mariehamn', 'AX', 'ALA', '+358', 'UTC+02:00', 'Euro']
['Albania', 'Tirana', 'AL', 'ALB', '+355', 'UTC+01:00', 'Albanian lek']
['Algeria', 'Algiers', 'DZ', 'DZA', '+213', 'UTC', 'Algerian dinar']
['American Samoa', 'Pago Pago', 'AS', 'ASM', '+1 684', 'UTC-11:00', '']
['Andorra', 'Andorra la Vella', 'AD', 'AND', '+376', 'UTC+01:00', 'Euro']
['Angola', 'Luanda', 'AO', 'AGO', '+244', 'UTC+01:00', 'Angolan kwanza']
['Anguilla', 'The Valley', 'AI', 'AIA', '+1 264', 'UTC-04:00', 'East Caribbean dollar']
['Antarctica', '', 'AQ', 'ATA', '', '', '']
['Antigua and Barbuda', "St. John's", 'AG', 'ATG', '+1 268', 'UTC-04:00', 'East Caribbean dollar']
['Argentina', 'Buenos Aires', 'AR', 'ARG', '+54', 'UTC-03:00', 'Argentine peso']
['Armenia', 'Yerevan', 'AM', 'ARM', '+374', 'UTC+04:00', 'Armenian dram']
['Aruba', 'Oranjestad', 'AW', 'ABW', '
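
**The standard library's `csv` module can do the same splitting, which may be a safer choice if fields are ever quoted. This is a sketch rather than the notebook's method; it uses two rows copied from the sample above, with `io.StringIO` standing in for the open file.**

```python
import csv
import io

# Two rows copied from the sample; io.StringIO stands in for the open file
sample = io.StringIO(
    "Country|Capital|CC|CC3|IAC|TimeZone|Currency\n"
    "Afghanistan|Kabul|AF|AFG|+93|UTC+04:30|Afghan afghani\n"
)

# delimiter tells the reader which character separates the fields
reader = csv.reader(sample, delimiter='|')
rows = list(reader)

for row in rows:
    print(row)
```

**With a real file, the `csv` documentation recommends opening it with `newline=''`.**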

**You can unpack each row into separate variables named after the columns: `country`, `capital`, `ccode`, `ccode3`, `iac`, `timezone` and `currency`.**

In [8]:
with open(input_file) as countries:
    for row in countries:
        data = row.strip().split('|')
        country, capital, ccode, ccode3, iac, timezone, currency = data
        print(country, capital, ccode, ccode3, iac, timezone, currency, sep='\n\t')

Country
	Capital
	CC
	CC3
	IAC
	TimeZone
	Currency
Afghanistan
	Kabul
	AF
	AFG
	+93
	UTC+04:30
	Afghan afghani
Aland Islands
	Mariehamn
	AX
	ALA
	+358
	UTC+02:00
	Euro
Albania
	Tirana
	AL
	ALB
	+355
	UTC+01:00
	Albanian lek
Algeria
	Algiers
	DZ
	DZA
	+213
	UTC
	Algerian dinar
American Samoa
	Pago Pago
	AS
	ASM
	+1 684
	UTC-11:00
	
Andorra
	Andorra la Vella
	AD
	AND
	+376
	UTC+01:00
	Euro
Angola
	Luanda
	AO
	AGO
	+244
	UTC+01:00
	Angolan kwanza
Anguilla
	The Valley
	AI
	AIA
	+1 264
	UTC-04:00
	East Caribbean dollar
Antarctica
	
	AQ
	ATA
	
	
	
Antigua and Barbuda
	St. John's
	AG
	ATG
	+1 268
	UTC-04:00
	East Caribbean dollar
Argentina
	Buenos Aires
	AR
	ARG
	+54
	UTC-03:00
	Argentine peso
Armenia
	Yerevan
	AM
	ARM
	+374
	UTC+04:00
	Armenian dram
Aruba
	Oranjestad
	AW
	ABW
	+297
	UTC-04:00
	Aruban florin
Australia
	Canberra
	AU
	AUS
	+61
	UTC+07:00 - UTC+10:00
	Australian dollar
Austria
	Vienna
	AT
	AUT
	+43
	UTC+01:00
	Euro
Azerbaijan
	Baku
	AZ
	AZE
	+994
	UTC+04:00
	Azerbaijani manat
Ba

In [10]:
# Remove the first row by reading it in before the loop starts

with open(input_file) as countries:
    countries.readline()
    for row in countries:
        data = row.strip().split('|')
        country, capital, ccode, ccode3, iac, timezone, currency = data
        print(country, capital, ccode, ccode3, iac, timezone, currency, sep='\n\t')

Afghanistan
	Kabul
	AF
	AFG
	+93
	UTC+04:30
	Afghan afghani
Aland Islands
	Mariehamn
	AX
	ALA
	+358
	UTC+02:00
	Euro
Albania
	Tirana
	AL
	ALB
	+355
	UTC+01:00
	Albanian lek
Algeria
	Algiers
	DZ
	DZA
	+213
	UTC
	Algerian dinar
American Samoa
	Pago Pago
	AS
	ASM
	+1 684
	UTC-11:00
	
Andorra
	Andorra la Vella
	AD
	AND
	+376
	UTC+01:00
	Euro
Angola
	Luanda
	AO
	AGO
	+244
	UTC+01:00
	Angolan kwanza
Anguilla
	The Valley
	AI
	AIA
	+1 264
	UTC-04:00
	East Caribbean dollar
Antarctica
	
	AQ
	ATA
	
	
	
Antigua and Barbuda
	St. John's
	AG
	ATG
	+1 268
	UTC-04:00
	East Caribbean dollar
Argentina
	Buenos Aires
	AR
	ARG
	+54
	UTC-03:00
	Argentine peso
Armenia
	Yerevan
	AM
	ARM
	+374
	UTC+04:00
	Armenian dram
Aruba
	Oranjestad
	AW
	ABW
	+297
	UTC-04:00
	Aruban florin
Australia
	Canberra
	AU
	AUS
	+61
	UTC+07:00 - UTC+10:00
	Australian dollar
Austria
	Vienna
	AT
	AUT
	+43
	UTC+01:00
	Euro
Azerbaijan
	Baku
	AZ
	AZE
	+994
	UTC+04:00
	Azerbaijani manat
Bahamas
	Nassau
	BS
	BHS
	+1 242
	UTC-05:00
	Bahamian

**Within the loop, you can store each row in a dictionary keyed by column name. Printing it shows how each row is structured.**

In [13]:
with open(input_file) as countries:
    countries.readline()
    for row in countries:
        data = row.strip().split('|')
        country, capital, ccode, ccode3, iac, timezone, currency = data
        country_dict = {
            'country': country, 
            'capital': capital,
            'ccode': ccode,
            'ccode3': ccode3, 
            'dial_code': iac, 
            'timezone': timezone,
            'currency': currency
        }
        print(country_dict)

{'country': 'Afghanistan', 'capital': 'Kabul', 'ccode': 'AF', 'ccode3': 'AFG', 'dial_code': '+93', 'timezone': 'UTC+04:30', 'currency': 'Afghan afghani'}
{'country': 'Aland Islands', 'capital': 'Mariehamn', 'ccode': 'AX', 'ccode3': 'ALA', 'dial_code': '+358', 'timezone': 'UTC+02:00', 'currency': 'Euro'}
{'country': 'Albania', 'capital': 'Tirana', 'ccode': 'AL', 'ccode3': 'ALB', 'dial_code': '+355', 'timezone': 'UTC+01:00', 'currency': 'Albanian lek'}
{'country': 'Algeria', 'capital': 'Algiers', 'ccode': 'DZ', 'ccode3': 'DZA', 'dial_code': '+213', 'timezone': 'UTC', 'currency': 'Algerian dinar'}
{'country': 'American Samoa', 'capital': 'Pago Pago', 'ccode': 'AS', 'ccode3': 'ASM', 'dial_code': '+1 684', 'timezone': 'UTC-11:00', 'currency': ''}
{'country': 'Andorra', 'capital': 'Andorra la Vella', 'ccode': 'AD', 'ccode3': 'AND', 'dial_code': '+376', 'timezone': 'UTC+01:00', 'currency': 'Euro'}
{'country': 'Angola', 'capital': 'Luanda', 'ccode': 'AO', 'ccode3': 'AGO', 'dial_code': '+244', 
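
**As an alternative to typing out every key, the header row itself can supply the dictionary keys. The sketch below assumes the raw Antarctica line looks as shown (reconstructed from the parsed list printed earlier). The keys are the file's own column names rather than the `ccode`-style names used in this notebook.**

```python
import io

sample = io.StringIO(
    "Country|Capital|CC|CC3|IAC|TimeZone|Currency\n"
    "Afghanistan|Kabul|AF|AFG|+93|UTC+04:30|Afghan afghani\n"
    "Antarctica||AQ|ATA|||\n"   # assumed raw form of the Antarctica row
)

# Read the header once and reuse it as the keys for every row
header = sample.readline().strip().split('|')

records = []
for row in sample:
    data = row.strip().split('|')
    if len(data) != len(header):
        continue  # skip malformed rows rather than fail on a length mismatch
    records.append(dict(zip(header, data)))

print(records[1])
```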

**You can put each of these dictionaries as values in another dictionary, using the case-folded country name as the key so that lookups are case-insensitive.**

In [2]:
country_info_dict = {}

with open(input_file, encoding='utf-8') as countries:
    countries.readline()
    for row in countries:
        data = row.strip().split('|')
        country, capital, ccode, ccode3, iac, timezone, currency = data
        country_dict = {
            'country': country, 
            'capital': capital,
            'ccode': ccode,
            'ccode3': ccode3, 
            'dial_code': iac, 
            'timezone': timezone,
            'currency': currency
        }
        country_info_dict[country.casefold()] = country_dict

In [3]:
print(country_info_dict)

{'afghanistan': {'country': 'Afghanistan', 'capital': 'Kabul', 'ccode': 'AF', 'ccode3': 'AFG', 'dial_code': '+93', 'timezone': 'UTC+04:30', 'currency': 'Afghan afghani'}, 'aland islands': {'country': 'Aland Islands', 'capital': 'Mariehamn', 'ccode': 'AX', 'ccode3': 'ALA', 'dial_code': '+358', 'timezone': 'UTC+02:00', 'currency': 'Euro'}, 'albania': {'country': 'Albania', 'capital': 'Tirana', 'ccode': 'AL', 'ccode3': 'ALB', 'dial_code': '+355', 'timezone': 'UTC+01:00', 'currency': 'Albanian lek'}, 'algeria': {'country': 'Algeria', 'capital': 'Algiers', 'ccode': 'DZ', 'ccode3': 'DZA', 'dial_code': '+213', 'timezone': 'UTC', 'currency': 'Algerian dinar'}, 'american samoa': {'country': 'American Samoa', 'capital': 'Pago Pago', 'ccode': 'AS', 'ccode3': 'ASM', 'dial_code': '+1 684', 'timezone': 'UTC-11:00', 'currency': ''}, 'andorra': {'country': 'Andorra', 'capital': 'Andorra la Vella', 'ccode': 'AD', 'ccode3': 'AND', 'dial_code': '+376', 'timezone': 'UTC+01:00', 'currency': 'Euro'}, 'angol

**Now you can extract and analyse data more easily, e.g. display the info for any country.**

In [53]:
print("Please enter country name")
country_name = input().casefold()
print()

if country_name:
    if country_name in country_info_dict:
        country_data = country_info_dict[country_name]
        for key, value in country_data.items():
            print(f"{key}: {value}")
    else:
        print("Country does not exist.")

Please enter country name
Yemen

country: Yemen
capital: Sanaá
ccode: YE
ccode3: YEM
dial_code: +967
timezone: UTC+03:00
currency: Yemeni rial


In [8]:
# Enter 'quit' to exit program

while True:
    country_name = input("Enter country name ").casefold()
    if country_name == 'quit':
        break
    if country_name in country_info_dict:
        country_data = country_info_dict[country_name]
        print(f"The capital of {country_data['country']} is {country_data['capital']}")
    else:
        # Unknown name: report it and let the loop prompt again
        print("Country does not exist. Try again.")


Enter country name Yeemeen
Country does not exist. Try again.
Enter country name Yemen
The capital of Yemen is Sanaá
Enter country name France
The capital of France is Paris
Enter country name Italy
The capital of Italy is Rome
Enter country name quit
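
**The lookup itself can be separated from the `input()` loop, which makes it easier to test. Below is a minimal sketch: `describe_capital` is a hypothetical helper, and `sample_info` is a two-entry stand-in for `country_info_dict`.**

```python
# A two-entry stand-in for country_info_dict, using rows shown earlier
sample_info = {
    'afghanistan': {'country': 'Afghanistan', 'capital': 'Kabul'},
    'antarctica': {'country': 'Antarctica', 'capital': ''},
}

def describe_capital(name, info):
    """Return a sentence about the capital, or a fallback message."""
    record = info.get(name.strip().casefold())   # None when the key is missing
    if record is None:
        return "Country does not exist."
    if not record['capital']:
        return f"{record['country']} has no capital listed."
    return f"The capital of {record['country']} is {record['capital']}"

print(describe_capital('Afghanistan', sample_info))
print(describe_capital('  ANTARCTICA ', sample_info))
print(describe_capital('Yeemeen', sample_info))
```

**Using `dict.get()` avoids a `KeyError` for unknown names, and the empty-capital case is handled explicitly.**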


**You can use the dictionary in other ways, e.g. look up country names given a country code. In this case, you need to access the nested dictionary values.**

In [18]:
country_code = input("Enter country code ").upper()

for dict_values in country_info_dict.values():
    for value in dict_values.values():
        if country_code in value:
            print(dict_values)


Enter country code sa
{'country': 'Saudi Arabia', 'capital': 'Riyadh', 'ccode': 'SA', 'ccode3': 'SAU', 'dial_code': '+966', 'timezone': 'UTC+03:00', 'currency': 'Saudi riyal'}
{'country': 'Saudi Arabia', 'capital': 'Riyadh', 'ccode': 'SA', 'ccode3': 'SAU', 'dial_code': '+966', 'timezone': 'UTC+03:00', 'currency': 'Saudi riyal'}
{'country': 'United States', 'capital': 'Washington D.C.', 'ccode': 'US', 'ccode3': 'USA', 'dial_code': '+1', 'timezone': 'UTC-11:00 - UTC-05:00', 'currency': 'United States dollar'}
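
**The `in` test in the loop above is a substring check on every field, which is why Saudi Arabia is printed twice (`'SA'` is found in both `'SA'` and `'SAU'`) and the United States is matched through `'USA'`. One way to get exact matches is to build a reverse index keyed by code. The sketch below uses a two-entry stand-in for `country_info_dict`, and the name `by_code` is only illustrative.**

```python
sample_info = {
    'saudi arabia': {'country': 'Saudi Arabia', 'ccode': 'SA', 'ccode3': 'SAU'},
    'united states': {'country': 'United States', 'ccode': 'US', 'ccode3': 'USA'},
}

# Build the index once: each two- and three-letter code points to its record
by_code = {}
for record in sample_info.values():
    by_code[record['ccode']] = record
    by_code[record['ccode3']] = record

print('SA' in 'USA')               # True: why the substring test over-matches
print(by_code['SA']['country'])    # Saudi Arabia
print(by_code.get('XX'))           # None
```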


**List all the countries that have no capital city, e.g. Antarctica. Note that two entries also have no timezone: Antarctica and the United States Minor Outlying Islands. Antarctica is a polar region where all the timezones converge at a single point, so a single value is meaningless; the United States Minor Outlying Islands are a group of widely scattered islands, so presumably no single timezone applies to them either.**

In [27]:
for dict_values in country_info_dict.values():
    if dict_values['capital'] == '':
        country_name = dict_values['country']
        country_timezone = dict_values['timezone']
        print(f"{country_name}\t ({country_timezone})")
        

Antarctica	 ()
Bouvet Island	 (UTC+01:00)
Heard Island and McDonald Islands	 (UTC+05)
Hong Kong	 (UTC+08:00)
Macau	 (UTC+08:00)
Reunion	 (UTC+04:00)
Svalbard and Jan Mayen	 (UTC+01:00)
United States Minor Outlying Islands	 ()
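
**The same filter can be generalised to any field with a list comprehension. This is a sketch in which `missing` is a hypothetical helper and `sample_info` is a three-entry stand-in for `country_info_dict`.**

```python
sample_info = {
    'antarctica': {'country': 'Antarctica', 'capital': '', 'timezone': ''},
    'hong kong': {'country': 'Hong Kong', 'capital': '', 'timezone': 'UTC+08:00'},
    'afghanistan': {'country': 'Afghanistan', 'capital': 'Kabul', 'timezone': 'UTC+04:30'},
}

def missing(field, info):
    """Return the names of records whose value for field is empty."""
    return [rec['country'] for rec in info.values() if not rec[field]]

print(missing('capital', sample_info))
print(missing('timezone', sample_info))
```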


In [28]:
for dict_values in country_info_dict.values():
    if dict_values['capital'] == '':
        print(dict_values)

{'country': 'Antarctica', 'capital': '', 'ccode': 'AQ', 'ccode3': 'ATA', 'dial_code': '', 'timezone': '', 'currency': ''}
{'country': 'Bouvet Island', 'capital': '', 'ccode': 'BV', 'ccode3': 'BVT', 'dial_code': '', 'timezone': 'UTC+01:00', 'currency': ''}
{'country': 'Heard Island and McDonald Islands', 'capital': '', 'ccode': 'HM', 'ccode3': 'HMD', 'dial_code': '', 'timezone': 'UTC+05', 'currency': ''}
{'country': 'Hong Kong', 'capital': '', 'ccode': 'HK', 'ccode3': 'HKG', 'dial_code': '+852', 'timezone': 'UTC+08:00', 'currency': 'Hong Kong dollar'}
{'country': 'Macau', 'capital': '', 'ccode': 'MO', 'ccode3': 'MAC', 'dial_code': '+853', 'timezone': 'UTC+08:00', 'currency': 'Macanese pataca'}
{'country': 'Reunion', 'capital': '', 'ccode': 'RE', 'ccode3': 'REU', 'dial_code': '+262', 'timezone': 'UTC+04:00', 'currency': 'Euro'}
{'country': 'Svalbard and Jan Mayen', 'capital': '', 'ccode': 'SJ', 'ccode3': 'SJM', 'dial_code': '+47', 'timezone': 'UTC+01:00', 'currency': 'Norwegian krone'}
{