# Data Wrangling with Python

## Introduction to Data Wrangling

Data wrangling, sometimes called data munging, is the process of transforming and mapping data from a "raw" form into another format to make it more appropriate and valuable for downstream purposes such as analytics. It is generally done at the very first stage of a data science or analytics pipeline, and it involves the following steps:

1. Data Acquisition: Identify and obtain access to the data within your sources (Web, database, files).
2. Joining Data: Combine the edited data for further use and analysis.
3. Data Cleansing: Redesign the data into a usable/functional format and correct/remove any bad data.
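The three steps above can be sketched with nothing but the standard library. This is a minimal illustration, not a prescribed workflow: the two in-memory CSV strings, the column names, and the `to_float` helper are all hypothetical stand-ins for real sources and real cleaning rules.

```python
import csv
import io

# Hypothetical raw sources; in practice these would be files, databases, or web APIs.
raw_customers = "id,name\n1,Ann\n2,Bob\n3,\n"
raw_orders = "customer_id,amount\n1,20.5\n2,abc\n1,4.5\n"

# 1. Data acquisition: read each source into a list of dictionaries.
customers = list(csv.DictReader(io.StringIO(raw_customers)))
orders = list(csv.DictReader(io.StringIO(raw_orders)))

# 2. Joining data: attach the customer name to every order.
names_by_id = {row["id"]: row["name"] for row in customers}
joined = [
    {"name": names_by_id.get(order["customer_id"], ""), "amount": order["amount"]}
    for order in orders
]


# 3. Data cleansing: convert amounts to float and drop rows that cannot be parsed.
def to_float(text):
    """Hypothetical helper: return a float, or None when the text is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


cleaned = [
    {"name": row["name"], "amount": to_float(row["amount"])}
    for row in joined
    if to_float(row["amount"]) is not None
]
print(cleaned)
```

The order with the amount `abc` is dropped during cleansing, which is the kind of "bad data" the third step refers to.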

### Advantages of using Python for data wrangling
1. General-purpose, open-source language that places no restriction on the methods you can develop for the specific problem at hand.
2. Great ecosystem of fast, optimized, open-source libraries focused on data analytics.
3. Growing support for connecting Python to nearly every data source type, and the ability to write code to connect to unsupported data sources.
4. Easy interface to data manipulation and visualization libraries to check data quality.
5. Ability to call C libraries directly, using Python as the coding language, for performance.
6. Strong GPU support that provides a performance boost for data crunching.

### Lists, Sets, Strings, Tuples, and Dictionaries
This section introduces the various basic data structures in Python. They provide the foundations to manipulate data in Python.

#### Lists
Lists are fundamental Python data structures that have the following properties:
- item references are stored in contiguous memory locations
- they can hold items of different data types
- items are accessed by index.

In [1]:
# Examples of lists
list_example = [ 51, 27, 34, 46, 90, 45, -19 ]  # Simple list with one data type

list_example2 = [ 15, 'Yellow car', True, [ 12, "Hello" ] ] # List with different data types

> Note: Mixing different data types in a single list can create subtle bugs that are very difficult to track down.
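As a small illustration of the kind of bug the note warns about, the sketch below uses a made-up `mixed` list. One failure is loud (sorting raises an error), the other is silent (a boolean is counted as a number).

```python
mixed = [15, 'Yellow car', True]

# Sorting has to compare elements; int and str cannot be ordered against each other.
try:
    sorted(mixed)
    sort_failed = False
except TypeError as error:
    sort_failed = True
    print(f'TypeError: {error}')

# A subtler case: True behaves like the integer 1, so it is silently included in a sum.
total = sum([15, True, 4])
print(total)  # 20
```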

##### List Access

Syntax
- Slicing
```
a[start:stop]      # items start through stop-1
a[start:]          # items start through the rest of the array
a[:stop]           # items from the beginning through stop-1
a[:]               # a copy of the whole array
a[start:stop:step] # start through not past stop, by step
```
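The slicing forms listed above can be tried on a short list. The `letters` list here is only an illustrative example.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

every_second = letters[::2]      # items at indices 0, 2, 4
middle_by_two = letters[1:5:2]   # items at indices 1 and 3
shallow_copy = letters[:]        # a new list holding the same items

# Slice bounds that run past the end are clipped instead of raising IndexError.
clipped = letters[4:100]

print(every_second)   # ['a', 'c', 'e']
print(middle_by_two)  # ['b', 'd']
print(clipped)        # ['e', 'f']
```

Note that `letters[:]` returns a different list object, so appending to the copy leaves `letters` untouched.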

In [2]:
# Define a list called list_1 with four integer members, using the following command:
list_1 = [ 34, 12, 89, 1 ]

| List index visualization       | 34 | 12 | 89 | 1  |
|------------------:|:--:|:--:|:--:|:--:|
| Indices (Forward) | 0  | 1  | 2  | 3  |
| Indices (Backward)| -4 | -3 | -2 | -1 |

In [3]:
# Access the first element from list
list_1[0]

34

In [4]:
# Access the last element from list
list_1[3]

1

In [5]:
# Access the last element from list using len function
list_1[len(list_1) - 1]

1

In [6]:
# Access the last element from list using backward index
list_1[-1]

1

In [7]:
# Access the middle 2 items using forward indices
list_1[1:3]

[12, 89]

> Note: The slice format is `list[start_index:end_index]`.
> `start_index` is the index where the slice starts. In the example it is `1`, the position holding `12`.
> `end_index` is the index where the slice stops; the item at `end_index` itself is excluded. In the example it is `3`, so the last item included is at index `3 - 1 = 2`, the position holding `89`.

In [8]:
# Access the last two elements using backward index
list_1[-2:]

[89, 1]

In [9]:
# Access the first two elements using backward index
list_1[:-2]

[34, 12]

In [10]:
# Reverse the elements in the list. This form is not recommended because it is hard to read
list_1[-1::-1]

[1, 89, 12, 34]

##### List Creation

In [11]:
# using the append method
list_1 = []
for x in range(0, 10):
    list_1.append(x)

list_1

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [12]:
# using list comprehension
list_2 = [ x for x in range(0, 10)]
list_2

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [13]:
# using list comprehension with conditions
list_3 = [ x for x in range(0, 100) if x % 5 == 0]
list_3

[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]

In [14]:
# using list concatenation
list_1 = [ 1, 4, 56, -1 ]
list_2 = [ 1, 39, 245, -23, 0, 45 ]
list_3 = list_1 + list_2
print(f'list_1: {list_1}')
print(f'list_2: {list_2}')
print(f'list_3: {list_3}')

list_1: [1, 4, 56, -1]
list_2: [1, 39, 245, -23, 0, 45]
list_3: [1, 4, 56, -1, 1, 39, 245, -23, 0, 45]


In [15]:
# using extend
list_1 = [ 1, 2, 3 ]
list_2 = [ 'a', 'b' ]
list_1.extend(list_2)
list_1

[1, 2, 3, 'a', 'b']

> `list.extend` will modify the list that the operation is performed on
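To make the in-place behavior concrete, here is a short sketch contrasting `extend` with `+` and `append`. The variable names are illustrative only.

```python
base = [1, 2, 3]
alias = base               # a second name for the same list object

combined = base + ['a']    # builds a new list; base is unchanged at this point
base.extend(['b'])         # modifies base in place, so alias sees the change too

nested = [1, 2, 3]
nested.append(['a', 'b'])  # append adds the whole list as ONE element

print(combined)  # [1, 2, 3, 'a']
print(alias)     # [1, 2, 3, 'b']
print(nested)    # [1, 2, 3, ['a', 'b']]
```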

In [16]:
# Generate a list with random values
import random
list_1 = [ random.randint(0, 30) for x in range(0, 5) ]
list_1

[28, 3, 15, 14, 1]

##### List Iteration

In [17]:
# Iterating through the list using the while loop
list_1 = [ 1, 2, 3, 4 ]
i = 0
while i < len(list_1):
    print(list_1[i])
    i += 1

1
2
3
4


In [18]:
# Iterating through the list using for in range(len(list)) [Not recommended: a Python for loop iterates over the items directly, so the manual index initialization, bounds checking, and index incrementing of traditional languages are unnecessary]
list_1 = [ x for x in range(0, 10) ]
for index in range(0, len(list_1)):
    print(list_1[index])

0
1
2
3
4
5
6
7
8
9


In [19]:
# Iterating through the list using the for [Recommended Pythonic way]
list_1 = [ x for x in range(0, 10) ]
for item in list_1:
    print(item)

0
1
2
3
4
5
6
7
8
9
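When the position is needed alongside the value, or when two lists have to be walked together, the built-ins `enumerate` and `zip` avoid manual index handling. A minimal sketch with made-up `fruits` and `prices` lists:

```python
fruits = ['apple', 'orange', 'mango']
prices = [1.2, 0.8, 2.5]

# enumerate yields (index, item) pairs.
indexed = []
for index, fruit in enumerate(fruits):
    indexed.append((index, fruit))

# zip pairs up items from both lists position by position.
price_lines = [f'{fruit}: {price}' for fruit, price in zip(fruits, prices)]

print(indexed)
print(price_lines)
```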


##### List Searching

In [20]:
# Found in list
list_1 = [ 1, 3, 4 ]
1 in list_1

True

In [21]:
# Not Found in list
list_1 = [ 1, 3, 4 ]
2 in list_1

False

##### List Sorting

In [22]:
# Sorted in reverse order using reverse parameter
list_1 = [ i for i in range(0, 10) ]
sorted(list_1, reverse=True)

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [23]:
# Sort in reverse order using reverse parameter
list_1 = [ i for i in range(0, 10) ]
list_1.sort(reverse=True)
list_1

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [24]:
# Reverse the list in place using the reverse method (the list starts in ascending order, so the result is in descending order)
list_1 = [ i for i in range(0, 10) ]
list_1.reverse()
list_1

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

> Note: Both `list.sort` and `list.reverse` work in place and modify the original list; `list.reverse` only reverses the current order and does not sort. `list.sort` also accepts an optional `key` parameter, a function that computes the value each element is compared by. `sorted(list)` supports the same parameters as `list.sort`. Using `sorted(list)` is preferred since it minimizes side effects. However, in use cases where the side effect does not cause issues, `list.sort` is recommended because `sorted` needs to allocate a new list, which is expensive and becomes a bigger concern when memory is restricted.
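The `key` parameter and the difference between `sorted` and `list.sort` can be seen in a short sketch. The `words` list is an arbitrary example.

```python
words = ['banana', 'apple', 'Cherry', 'fig']

default_order = sorted(words)                  # uppercase letters sort before lowercase
case_insensitive = sorted(words, key=str.lower)
by_length = sorted(words, key=len)             # ties keep their original relative order

print(default_order)     # ['Cherry', 'apple', 'banana', 'fig']
print(case_insensitive)  # ['apple', 'banana', 'Cherry', 'fig']
print(by_length)         # ['fig', 'apple', 'banana', 'Cherry']

# sorted() left the original untouched; list.sort() changes it and returns None.
untouched = list(words)
result = words.sort(key=len)
print(result)            # None
```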

##### Activity 1: Handling Lists

In [25]:
# create a list of 100 random numbers
import random

LIMIT = 100

random_list = [ random.randint(0, LIMIT) for i in range(0, LIMIT) ]
random_list

[63,
 32,
 32,
 46,
 28,
 22,
 20,
 41,
 27,
 3,
 37,
 86,
 10,
 69,
 71,
 42,
 66,
 4,
 6,
 15,
 62,
 62,
 35,
 100,
 16,
 68,
 79,
 12,
 43,
 83,
 87,
 70,
 3,
 64,
 93,
 14,
 91,
 35,
 19,
 49,
 13,
 95,
 45,
 18,
 50,
 84,
 36,
 99,
 55,
 6,
 8,
 57,
 9,
 0,
 86,
 49,
 66,
 82,
 85,
 75,
 69,
 79,
 65,
 30,
 71,
 45,
 8,
 64,
 64,
 22,
 15,
 71,
 52,
 82,
 35,
 43,
 42,
 78,
 82,
 20,
 92,
 5,
 76,
 46,
 78,
 35,
 65,
 16,
 100,
 56,
 10,
 61,
 5,
 79,
 30,
 1,
 13,
 4,
 47,
 96]

In [26]:
# generate a new list of numbers divisible by 3 from random_list
divisible_by_3 = [ i for i in random_list if i % 3 == 0 ]
divisible_by_3

[63,
 27,
 3,
 69,
 42,
 66,
 6,
 15,
 12,
 87,
 3,
 93,
 45,
 18,
 84,
 36,
 99,
 6,
 57,
 9,
 0,
 66,
 75,
 69,
 30,
 45,
 15,
 42,
 78,
 78,
 30,
 96]

In [27]:
# calculate the difference between the lengths of the two lists and store it in a variable
diff_in_count = len(random_list) - len(divisible_by_3)
diff_in_count

68

In [28]:
# repeat the experiment several times and average the difference
NUMBER_OF_EXPERIMENTS = 10
difference_list = []
for _ in range(0, NUMBER_OF_EXPERIMENTS):
    random_list = [ random.randint(0, LIMIT) for i in range(0, LIMIT) ]
    divisible_by_3 = [ i for i in random_list if i % 3 == 0 ]
    difference_list.append(len(random_list) - len(divisible_by_3))

avg_diff = sum(difference_list) / len(difference_list)
avg_diff

67.2
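An average such as `67.2` can be sanity-checked against the value expected from counting. `random.randint(0, LIMIT)` includes both endpoints, so there are 101 possible values, 34 of which are multiples of 3. The sketch below computes that expectation and then reruns the experiment many more times; the seed and the number of experiments are arbitrary choices made here for repeatability, not part of the activity.

```python
import random

LIMIT = 100

# randint(0, LIMIT) is inclusive at both ends, so there are LIMIT + 1 possible values.
multiples_of_3 = len([n for n in range(0, LIMIT + 1) if n % 3 == 0])  # 34
expected_diff = LIMIT * (1 - multiples_of_3 / (LIMIT + 1))
print(round(expected_diff, 2))  # 66.34

random.seed(0)  # arbitrary seed, used only so that reruns give the same numbers
experiments = 1000
differences = []
for _ in range(experiments):
    draws = [random.randint(0, LIMIT) for _ in range(LIMIT)]
    differences.append(sum(1 for n in draws if n % 3 != 0))

simulated_avg = sum(differences) / experiments
print(simulated_avg)
```

With more experiments the simulated average should settle close to the expected value, while a run of only 10 experiments can easily land a point or so away from it.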

#### Set
A set is an unordered collection of distinct objects.

##### Set Creation

In [29]:
# creating a set
list_1 = [ 1, 2, 2, 3, 1, 5 ]
set_1 = set(list_1)
set_1

{1, 2, 3, 5}

In [30]:
# creating an empty set
null_set_1 = set()
null_set_1

set()

> Note: This is different from creating an empty dictionary, which is `dict_1 = {}`
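The distinction in the note can be checked directly with `type`. The variable names below are illustrative.

```python
empty_braces = {}
empty_set = set()
single_item = {1}  # non-empty braces without key: value pairs create a set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
print(type(single_item))   # <class 'set'>
```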

##### Set Operation

In [31]:
# Union - Get a new set of all unique elements found in either of the 2 sets
set_1 = { 'Apple', 'Orange', 'Pineapple' }
set_2 = { 'Mango', 'Orange', 'Coconut' }
set_1 | set_2

{'Apple', 'Coconut', 'Mango', 'Orange', 'Pineapple'}

In [32]:
# Intersection - Get a new set of all elements that belong to both sets
set_1 = { 'Apple', 'Orange', 'Pineapple' }
set_2 = { 'Mango', 'Orange', 'Coconut' }
set_1 & set_2

{'Orange'}

In [33]:
# Symmetric difference - Get a new set of all elements that belong to either of the 2 sets but not both
set_1 = { 'Apple', 'Orange', 'Pineapple' }
set_2 = { 'Mango', 'Orange', 'Coconut' }
(set_1 | set_2) - (set_1 & set_2)

{'Apple', 'Coconut', 'Mango', 'Pineapple'}
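Python also offers a dedicated operator and method for this last operation, as well as a plain difference. A short sketch using the same two sets:

```python
set_1 = {'Apple', 'Orange', 'Pineapple'}
set_2 = {'Mango', 'Orange', 'Coconut'}

# The ^ operator gives the same result as (union - intersection).
symmetric = set_1 ^ set_2
same_result = symmetric == (set_1 | set_2) - (set_1 & set_2)

# The method form is equivalent to the operator.
via_method = set_1.symmetric_difference(set_2)

# Plain difference: elements of set_1 that are not in set_2.
only_in_set_1 = set_1 - set_2

print(same_result)  # True
```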

#### Dictionary
A dictionary is similar to a list in that it is a collection, but it differs in that it stores key-value pairs. A key can be any value that is hashable.
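In practice, immutable built-in values such as strings, numbers, and tuples of such values are hashable, while mutable containers such as lists are not. A minimal sketch with a hypothetical `coordinates` dictionary:

```python
# A tuple of integers is hashable, so it can be used as a key.
coordinates = {(0, 0): 'origin', (1, 0): 'east'}

# A list is mutable and therefore unhashable; using it as a key raises TypeError.
try:
    bad = {[0, 0]: 'origin'}
    list_key_ok = True
except TypeError as error:
    list_key_ok = False
    print(f'TypeError: {error}')

print(coordinates[(1, 0)])  # east
```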

##### Dictionary Creation

In [34]:
# create a simple dictionary
dict_1 = { 'key1': 'value1', 'key2': 'value2' }
dict_1

{'key1': 'value1', 'key2': 'value2'}

In [35]:
# create a simple dictionary with different types
dict_2 = {
    'key1': 'value1', 
    'key2': [ 'list_element1', 34 ],
    'key3': 'value3',
    'key4': { 'subkey1': 'v1' },
    'key5': 4.5
}
dict_2

{'key1': 'value1',
 'key2': ['list_element1', 34],
 'key3': 'value3',
 'key4': {'subkey1': 'v1'},
 'key5': 4.5}

In [36]:
# create a dictionary using dict on a list of tuples
dict_2 = dict([
    ('Tom', 'Cat'),
    ('Jerry', 'Mouse'),
])
dict_2

{'Tom': 'Cat', 'Jerry': 'Mouse'}

In [37]:
# Non unique keys
dict_3 = { 'key1': 'value1', 'key1': 'value2' }
dict_3

{'key1': 'value2'}

> Note: Keys must be unique; otherwise the later declaration overrides the value of the earlier one.

##### Dictionary Access

In [38]:
# Access an element in the dictionary using the key
dict_1['key2']

'value2'

In [39]:
# Override an element in the dictionary using the key
dict_1['key2'] = 'My new value'
dict_1['key2']

'My new value'

In [40]:
# Define a blank dictionary and use the key notation to assign values
dict_3 = {}
dict_3['key1'] = 'Value1'
dict_3

{'key1': 'Value1'}
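Looking up a key that is not present with square brackets raises `KeyError`. When a missing key is a normal possibility, the `in` operator and `dict.get` avoid the exception. A small sketch with an illustrative `pets` dictionary:

```python
pets = {'Tom': 'Cat', 'Jerry': 'Mouse'}

has_spike = 'Spike' in pets                      # membership test on the keys
spike = pets.get('Spike')                        # None when the key is missing
spike_or_default = pets.get('Spike', 'Unknown')  # explicit fallback value

try:
    pets['Spike']
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(has_spike, spike, spike_or_default, lookup_failed)
```

`dict.get` does not add the missing key, so `pets` still has only two entries afterwards.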

##### Dictionary Iteration

In [41]:
# looping using for and dict.items()
dict_1 = {
    'key1': 'value1', 
    'key2': [ 'list_element1', 34 ],
    'key3': 'value3',
    'key4': { 'subkey1': 'v1' },
    'key5': 4.5
}

for k, v in dict_1.items():
    print(f'{k} - {v}')

key1 - value1
key2 - ['list_element1', 34]
key3 - value3
key4 - {'subkey1': 'v1'}
key5 - 4.5


##### Dictionary Delete

In [42]:
# Delete an element from the dictionary
dict_1 = {
    "key1": 1,
    "key2": ["list_element1", 34],
    "key3": "value3",
    "key4": {"subkey1": "v1"},
    "key5": 4.5
}

print(f'dict_1 pre delete: {dict_1}')
del dict_1['key2']
print(f'dict_1 post delete: {dict_1}')

dict_1 pre delete: {'key1': 1, 'key2': ['list_element1', 34], 'key3': 'value3', 'key4': {'subkey1': 'v1'}, 'key5': 4.5}
dict_1 post delete: {'key1': 1, 'key3': 'value3', 'key4': {'subkey1': 'v1'}, 'key5': 4.5}


##### Dictionary Comprehension

In [43]:
# using comprehension to generate a dictionary with x as key and x ** 2 as value
dict_1 = { x: x ** 2 for x in range(0, 10) }
dict_1

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

In [45]:
# using comprehension to generate a dictionary with x as key and the square root of x as value
dict_2 = { x: x ** (1.0 / 2) for x in range(0, 10) }
dict_2

{0: 0.0,
 1: 1.0,
 2: 1.4142135623730951,
 3: 1.7320508075688772,
 4: 2.0,
 5: 2.23606797749979,
 6: 2.449489742783178,
 7: 2.6457513110645907,
 8: 2.8284271247461903,
 9: 3.0}

##### Dictionary Special Methods

In [46]:
# Less common use of a dictionary: remove duplicates from a list while keeping the first-seen order
import random
list_1 = [ random.randint(0, 3) for x in range(0, 100) ]
list(dict.fromkeys(list_1).keys())

[3, 1, 0, 2]

#### Tuples
A tuple is an immutable sequence of Python objects. Tuples are sequences just like lists, but they cannot be modified. The elements of a tuple are declared using parentheses (optional unless the tuple is empty), while lists use square brackets.

##### Tuple Creation

In [47]:
# creating a tuple
tuple_1 = 24, 42, 2.3456, 'Hello'
tuple_1

(24, 42, 2.3456, 'Hello')

In [48]:
# creating an empty tuple
tuple_1 = ()
tuple_1

()

In [49]:
# creating a tuple with 1 value.
tuple_1 = 'Hello',
tuple_1

('Hello',)

> Note: the trailing comma is required for it to be declared as a tuple
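The effect of the trailing comma is easy to verify: without it, the parentheses are just grouping and the value stays a string. The names below are illustrative.

```python
not_a_tuple = ('Hello')   # parentheses only group; this is still a str
is_a_tuple = ('Hello',)   # the comma is what makes it a tuple

print(type(not_a_tuple))  # <class 'str'>
print(type(is_a_tuple))   # <class 'tuple'>
print(len(not_a_tuple))   # 5 (characters)
print(len(is_a_tuple))    # 1 (element)
```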

In [50]:
# creating a nested tuple
tuple_1 = 'hello', 'there'
tuple_12 = tuple_1, 45, 'Sam'
tuple_12

(('hello', 'there'), 45, 'Sam')

##### Tuple Unpacking

In [51]:
tuple_1 = 'hello', 'world'
hello, world = tuple_1
print(f'hello: {hello}')
print(f'world: {world}')

hello: hello
world: world


##### Tuple Error

In [52]:
tuple_1 = 'hello', 'there'
tuple_1[1] = 'This is not allowed'

TypeError: 'tuple' object does not support item assignment
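One caveat worth knowing: immutability applies to the tuple itself, not to the objects it holds. A tuple that contains a list cannot have its slots reassigned, but the list inside can still change. A minimal sketch with a made-up `record` tuple:

```python
record = ('Sam', [90, 85])

# Reassigning a slot of the tuple is not allowed.
try:
    record[0] = 'Max'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Mutating the list stored inside the tuple is allowed.
record[1].append(70)

print(assignment_failed)  # True
print(record)             # ('Sam', [90, 85, 70])
```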

#### Strings
Strings are sequences of characters, and they can be defined using single or double quotes. The opening and closing quotes must match, meaning `''` and `""` are allowed while `'"` and `"'` are not.

##### String Creation

In [None]:
# String creation using `''`
string1 = 'Hello World!'
string1

In [None]:
# String creation using `""`
string2 = "Hello World!"
string2

##### String Access

In [None]:
str_1 = 'Hello World!'

# Access first member
str_1[0]

In [None]:
# Access the fourth member
str_1[3]

In [None]:
# Access the last member
str_1[len(str_1) - 1]

In [None]:
# Access the last member using reverse index
str_1[-1]

> Note: Accessing a member of the string is like accessing a member of a list
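The similarity stops at reading: unlike lists, strings are immutable, so assigning to an index fails. A changed string has to be built as a new object, as this short sketch shows.

```python
str_1 = 'Hello World!'

# Item assignment is not supported on strings.
try:
    str_1[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string from a replacement character and a slice of the old one.
str_2 = 'J' + str_1[1:]

print(assignment_failed)  # True
print(str_2)              # Jello World!
```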

##### String Slicing

In [None]:
str_1 = 'Hello World! I am learning data wrangling'

# length of string for reference
len(str_1)

In [None]:
# slice from member at index 2 till index 9
str_1[2:10]

In [None]:
# slice from member at index 10 till end of string. Using reverse indexing (len(str_1) - 31 = 10)
str_1[-31:]

In [None]:
# slice using reverse indexing
str_1[-10:-5]

##### String Functions

In [53]:
# Getting the length
str_1 = 'Hello World! I am learning data wrangling'
len(str_1)

41

In [54]:
# lower case
str_1 = 'A COMPLETE {0} CASE STRING'

str_1.format('LOWER').lower()

'a complete lower case string'

In [55]:
# upper case
str_1.format('upper').upper()

'A COMPLETE UPPER CASE STRING'

In [56]:
# string search successful
str_1 = "A complicated string looks like this"
str_1.find('complicated')  # successful search returns index of first character found

2

In [57]:
# string search unsuccessful
str_1.find('Complicated')  # unsuccessful search returns -1

-1

In [58]:
# string replace successful
str_1 = "A complicated string looks like this"
str_1.replace('complicated', 'simple')

'A simple string looks like this'

In [59]:
# string replace unsuccessful
str_1 = "A complicated string looks like this"
str_1.replace('Complicated', 'simple')

'A complicated string looks like this'

In [60]:
# string split
str_1 = 'Name,Age,Sex,Address'
list_1 = str_1.split(',')
list_1

['Name', 'Age', 'Sex', 'Address']

In [61]:
# string join
" | ".join(list_1)

'Name | Age | Sex | Address'

##### Activity 2: Analyze a multiline string and generate the unique word count

In [62]:
import string

# create multiline string
multi_line_str = """
It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.

However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or other of their daughters.

"My dear Mr. Bennet," said his lady to him one day, "have you heard that Netherfield Park is let at last?"

Mr. Bennet replied that he had not.

"But it is," returned she; "for Mrs. Long has just been here, and she told me all about it."

Mr. Bennet made no answer.

"Do you not want to know who has taken it?" cried his wife impatiently.

"You want to tell me, and I have no objection to hearing it."

This was invitation enough.

"Why, my dear, you must know, Mrs. Long says that Netherfield is taken by a young man of large fortune from the north of England; that he came down on Monday in a chaise and four to see the place, and was so much delighted with it, that he agreed with Mr. Morris immediately; that he is to take possession before Michaelmas, and some of his servants are to be in the house by the end of next week."

"What is his name?"

"Bingley."

"Is he married or single?"

"Oh! Single, my dear, to be sure! A single man of large fortune; four or five thousand a year. What a fine thing for our girls!"

"How so? How can it affect them?"

"My dear Mr. Bennet," replied his wife, "how can you be so tiresome! You must know that I am thinking of his marrying one of them."

"Is that his design in settling here?"

"Design! Nonsense, how can you talk so! But it is very likely that he may fall in love with one of them, and therefore you must visit him as soon as he comes."

"I see no occasion for that. You and the girls may go, or you may send them by themselves, which perhaps will be still better, for as you are as handsome as any of them, Mr. Bingley may like you the best of the party."

"My dear, you flatter me. I certainly have had my share of beauty, but I do not pretend to be anything extraordinary now. When a woman has five grown-up daughters, she ought to give over thinking of her own beauty."

"In such cases, a woman has not often much beauty to think of."

"But, my dear, you must indeed go and see Mr. Bingley when he comes into the neighbourhood."

"It is more than I engage for, I assure you."

"But consider your daughters. Only think what an establishment it would be for one of them. Sir William and Lady Lucas are determined to go, merely on that account, for in general, you know, they visit no newcomers. Indeed you must go, for it will be impossible for us to visit him if you do not."

"You are over-scrupulous, surely. I dare say Mr. Bingley will be very glad to see you; and I will send a few lines by you to assure him of my hearty consent to his marrying whichever he chooses of the girls; though I must throw in a good word for my little Lizzy."

"I desire you will do no such thing. Lizzy is not a bit better than the others; and I am sure she is not half so handsome as Jane, nor half so good-humoured as Lydia. But you are always giving her the preference."

"They have none of them much to recommend them," replied he; "they are all silly and ignorant like other girls; but Lizzy has something more of quickness than her sisters."

"Mr. Bennet, how can you abuse your own children in such a way? You take delight in vexing me. You have no compassion for my poor nerves."

"You mistake me, my dear. I have a high respect for your nerves. They are my old friends. I have heard you mention them with consideration these last twenty years at least."

"Ah, you do not know what I suffer."

"But I hope you will get over it, and live to see many young men of four thousand a year come into the neighbourhood."

"It will be no use to us, if twenty such should come, since you will not visit them."

"Depend upon it, my dear, that when there are twenty, I will visit them all."

Mr. Bennet was so odd a mixture of quick parts, sarcastic humour, reserve, and caprice, that the experience of three-and-twenty years had been insufficient to make his wife understand his character. Her mind was less difficult to develop. She was a woman of mean understanding, little information, and uncertain temper. When she was discontented, she fancied herself nervous. The business of her life was to get her daughters married; its solace was visiting and news. 
"""
multi_line_str

'\nIt is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.\n\nHowever little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or other of their daughters.\n\n"My dear Mr. Bennet," said his lady to him one day, "have you heard that Netherfield Park is let at last?"\n\nMr. Bennet replied that he had not.\n\n"But it is," returned she; "for Mrs. Long has just been here, and she told me all about it."\n\nMr. Bennet made no answer.\n\n"Do you not want to know who has taken it?" cried his wife impatiently.\n\n"You want to tell me, and I have no objection to hearing it."\n\nThis was invitation enough.\n\n"Why, my dear, you must know, Mrs. Long says that Netherfield is taken by a young man of large fortune from the north of England; that he came down on Monday in a chaise

In [63]:
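# remove \n (caution: replacing with '' fuses the last word of one paragraph with the first word of the next, as in 'wife.However' in the output below)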
multi_line_str = multi_line_str.replace('\n', '')
multi_line_str

'It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or other of their daughters."My dear Mr. Bennet," said his lady to him one day, "have you heard that Netherfield Park is let at last?"Mr. Bennet replied that he had not."But it is," returned she; "for Mrs. Long has just been here, and she told me all about it."Mr. Bennet made no answer."Do you not want to know who has taken it?" cried his wife impatiently."You want to tell me, and I have no objection to hearing it."This was invitation enough."Why, my dear, you must know, Mrs. Long says that Netherfield is taken by a young man of large fortune from the north of England; that he came down on Monday in a chaise and four to see the place, and was so

In [64]:
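# remove punctuation (the result is only displayed here, not assigned back to multi_line_str, so the counts below still contain punctuation)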
multi_line_str.translate(str.maketrans('', '', string.punctuation))

'It is a truth universally acknowledged that a single man in possession of a good fortune must be in want of a wifeHowever little known the feelings or views of such a man may be on his first entering a neighbourhood this truth is so well fixed in the minds of the surrounding families that he is considered the rightful property of some one or other of their daughtersMy dear Mr Bennet said his lady to him one day have you heard that Netherfield Park is let at lastMr Bennet replied that he had notBut it is returned she for Mrs Long has just been here and she told me all about itMr Bennet made no answerDo you not want to know who has taken it cried his wife impatientlyYou want to tell me and I have no objection to hearing itThis was invitation enoughWhy my dear you must know Mrs Long says that Netherfield is taken by a young man of large fortune from the north of England that he came down on Monday in a chaise and four to see the place and was so much delighted with it that he agreed with

In [65]:
# count
word_count = {}

for word in multi_line_str.split(' '):
    word = word.lower()
    if word != '':
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1
        
word_count

{'it': 6,
 'is': 11,
 'a': 21,
 'truth': 2,
 'universally': 1,
 'acknowledged,': 1,
 'that': 14,
 'single': 2,
 'man': 4,
 'in': 11,
 'possession': 2,
 'of': 28,
 'good': 2,
 'fortune,': 1,
 'must': 7,
 'be': 11,
 'want': 3,
 'wife.however': 1,
 'little': 3,
 'known': 1,
 'the': 18,
 'feelings': 1,
 'or': 5,
 'views': 1,
 'such': 5,
 'may': 5,
 'on': 3,
 'his': 11,
 'first': 1,
 'entering': 1,
 'neighbourhood,': 1,
 'this': 1,
 'so': 6,
 'well': 1,
 'fixed': 1,
 'minds': 1,
 'surrounding': 1,
 'families,': 1,
 'he': 10,
 'considered': 1,
 'rightful': 1,
 'property': 1,
 'some': 2,
 'one': 5,
 'other': 2,
 'their': 1,
 'daughters."my': 1,
 'dear': 2,
 'mr.': 6,
 'bennet,"': 2,
 'said': 1,
 'lady': 2,
 'to': 22,
 'him': 4,
 'day,': 1,
 '"have': 1,
 'you': 26,
 'heard': 2,
 'netherfield': 2,
 'park': 1,
 'let': 1,
 'at': 2,
 'last?"mr.': 1,
 'bennet': 3,
 'replied': 3,
 'had': 3,
 'not."but': 1,
 'is,"': 1,
 'returned': 1,
 'she;': 1,
 '"for': 1,
 'mrs.': 2,
 'long': 2,
 'has': 5,
 'just'

In [68]:
# sort by count, highest first
sorted_word_count = sorted(word_count.items(), key=lambda key_val_tuple: key_val_tuple[1], reverse=True)
sorted_word_count[:25]

[('of', 28),
 ('you', 26),
 ('to', 22),
 ('a', 21),
 ('the', 18),
 ('and', 16),
 ('i', 15),
 ('that', 14),
 ('is', 11),
 ('in', 11),
 ('be', 11),
 ('his', 11),
 ('he', 10),
 ('my', 10),
 ('for', 10),
 ('will', 9),
 ('was', 8),
 ('are', 8),
 ('must', 7),
 ('no', 7),
 ('not', 7),
 ('as', 7),
 ('it', 6),
 ('so', 6),
 ('mr.', 6)]
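The counts above still contain entries such as `'acknowledged,'` and `'wife.however'`, because line breaks were replaced with an empty string and the punctuation-free text was never stored. A possible variant that avoids both issues is sketched below on a two-line excerpt. It swaps the manual dictionary for `collections.Counter`; the `sample`, `table`, and `counts` names are illustrative, and this is one way to approach the activity rather than its official solution.

```python
import string
from collections import Counter

# Short excerpt used only to keep the example small.
sample = '''"My dear Mr. Bennet," said his lady.

Mr. Bennet made no answer.'''

# str.split() with no argument splits on any whitespace, including newlines,
# so words on either side of a line break stay separate.
table = str.maketrans('', '', string.punctuation)
words = [word.translate(table).lower() for word in sample.split()]
words = [word for word in words if word]  # drop tokens that were only punctuation

counts = Counter(words)
print(counts.most_common(2))  # [('mr', 2), ('bennet', 2)]
```

One design caveat: `string.punctuation` includes the hyphen, so a word like `grown-up` would become `grownup` under this approach.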