forked from fluentpython/example-code
Commit
Merge pull request #3 from juanshishido/master
Chapter 3: dictionaries
Showing 2 changed files with 367 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,166 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Dial codes and Dictionaries\n", | ||
"\n", | ||
"This notebook contains example code from [*Fluent Python*](http://shop.oreilly.com/product/0636920032519.do), by Luciano Ramalho.\n", | ||
"\n", | ||
"Code by Luciano Ramalho, modified by Allen Downey.\n", | ||
"\n", | ||
"MIT License: https://opensource.org/licenses/MIT" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Below, we'll show what happens when we create dictionaries using the data in `DIAL_CODES` when that data is sorted in different ways.\n", | ||
"\n", | ||
"We'll start by creating the data—a list of tuples with country codes and country names." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# dial codes of the top 10 most populous countries\n", | ||
"DIAL_CODES = [\n", | ||
"    (86, 'China'),\n", | ||
"    (91, 'India'),\n", | ||
"    (1, 'United States'),\n", | ||
"    (62, 'Indonesia'),\n", | ||
"    (55, 'Brazil'),\n", | ||
"    (92, 'Pakistan'),\n", | ||
"    (880, 'Bangladesh'),\n", | ||
"    (234, 'Nigeria'),\n", | ||
"    (7, 'Russia'),\n", | ||
"    (81, 'Japan'),\n", | ||
"    ]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We can create a Python dictionary, `d1`, by calling the `dict()` function with `DIAL_CODES` as the argument. Its `.keys()` method returns a view of all of `d1`'s keys." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"d1: dict_keys([880, 1, 86, 55, 7, 234, 91, 92, 62, 81])\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"d1 = dict(DIAL_CODES)\n", | ||
"print('d1:', d1.keys())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Notice that the keys are *not* in the same order as in `DIAL_CODES`.\n", | ||
"\n", | ||
"Let's create two more dictionaries, this time from sorted input data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"d2: dict_keys([880, 1, 91, 86, 81, 55, 234, 7, 92, 62])\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"d2 = dict(sorted(DIAL_CODES))\n", | ||
"print('d2:', d2.keys())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"d3: dict_keys([880, 81, 1, 86, 55, 7, 234, 91, 92, 62])\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"d3 = dict(sorted(DIAL_CODES, key=lambda x: x[1]))\n", | ||
"print('d3:', d3.keys())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Again, we see the keys are not in the same order as in `DIAL_CODES` or as in `d1`.\n", | ||
"\n", | ||
"However, the three dictionaries compare equal." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"assert d1 == d2 and d2 == d3" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
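The comparison in this notebook can be condensed into a short standalone sketch. It assumes a four-entry subset of `DIAL_CODES` and uses hypothetical names (`by_input`, `by_code`, `by_name`) that do not appear in the original. The shuffled key orders in the notebook output come from Python 3.5.1, the version recorded in the notebook metadata; since Python 3.7 a `dict` preserves insertion order, so the key-order comments below assume 3.7 or later. Equality ignores order on every version.

```python
# A small subset of the notebook's data, kept short for illustration.
DIAL_CODES = [
    (86, 'China'),
    (91, 'India'),
    (1, 'United States'),
    (62, 'Indonesia'),
]

# Hypothetical names: the same pairs inserted in three different orders.
by_input = dict(DIAL_CODES)                                   # original order
by_code = dict(sorted(DIAL_CODES))                            # sorted by dial code
by_name = dict(sorted(DIAL_CODES, key=lambda pair: pair[1]))  # sorted by country name

# Equality compares key/value pairs only; ordering plays no part.
same_content = by_input == by_code == by_name
print(same_content)  # True

# Assuming Python 3.7+, iteration follows insertion order:
print(list(by_input))  # [86, 91, 1, 62]
print(list(by_code))   # [1, 62, 86, 91]
print(list(by_name))   # [86, 91, 62, 1]
```

If a stable ordering is needed for display on any Python version, sorting the keys at print time, as in `sorted(by_input)`, does not depend on how the dictionary was built.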
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,201 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Handling Missing Keys with `setdefault`\n", | ||
"\n", | ||
"This notebook contains example code from [*Fluent Python*](http://shop.oreilly.com/product/0636920032519.do), by Luciano Ramalho.\n", | ||
"\n", | ||
"Code by Luciano Ramalho, modified by Allen Downey.\n", | ||
"\n", | ||
"MIT License: https://opensource.org/licenses/MIT" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In this notebook, we show two ways to create a mapping of words and their occurrences. For each word in a text file, we create a list of tuples, one for each occurrence. The tuple values represent the position (line and column) of the word in the file. (Note that the line and column positions are indexed starting at one.)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import os\n", | ||
"import re" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"file_name = 'text.txt'" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# temporary text file\n", | ||
"with open(file_name, 'w') as f:\n", | ||
"    f.write('Fluent Python notebooks\\nJupyter notebooks')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"WORD_RE = re.compile(r'\\w+')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# using `.get()` method\n", | ||
"index = {}\n", | ||
"with open(file_name) as fp:\n", | ||
"    for line_no, line in enumerate(fp, 1):\n", | ||
"        for match in WORD_RE.finditer(line):\n", | ||
"            word = match.group()\n", | ||
"            column_no = match.start()+1\n", | ||
"            location = (line_no, column_no)\n", | ||
"            # this is ugly; coded like this to make a point\n", | ||
"            occurrences = index.get(word, [])  # <1>\n", | ||
"            occurrences.append(location)  # <2>\n", | ||
"            index[word] = occurrences  # <3>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Fluent [(1, 1)]\n", | ||
"Jupyter [(2, 1)]\n", | ||
"notebooks [(1, 15), (2, 9)]\n", | ||
"Python [(1, 8)]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"# print in alphabetical order\n", | ||
"for word in sorted(index, key=str.upper):\n", | ||
"    print(word, index[word])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# better solution\n", | ||
"# using `.setdefault()` method\n", | ||
"index = {}\n", | ||
"with open(file_name) as fp:\n", | ||
"    for line_no, line in enumerate(fp, 1):\n", | ||
"        for match in WORD_RE.finditer(line):\n", | ||
"            word = match.group()\n", | ||
"            column_no = match.start()+1\n", | ||
"            location = (line_no, column_no)\n", | ||
"            index.setdefault(word, []).append(location)  # <1>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Fluent [(1, 1)]\n", | ||
"Jupyter [(2, 1)]\n", | ||
"notebooks [(1, 15), (2, 9)]\n", | ||
"Python [(1, 8)]\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"# print in alphabetical order\n", | ||
"for word in sorted(index, key=str.upper):\n", | ||
"    print(word, index[word])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This shows that both blocks of code produce the same result. However, the `.setdefault()` version uses fewer lines of code and is also more efficient, using a single lookup as opposed to up to three with the `.get()` method." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"os.remove(file_name)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
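The `.setdefault()` approach from this notebook can also be tried without touching the filesystem. The sketch below wraps the same loop in a hypothetical helper, `build_index`, and feeds it an in-memory `io.StringIO` buffer instead of `text.txt`; both the helper name and the use of `StringIO` are assumptions made for illustration and are not part of the original code.

```python
import io
import re

WORD_RE = re.compile(r'\w+')


def build_index(lines):
    """Map each word to a list of (line_no, column_no) positions, both 1-based.

    `lines` is any iterable of strings, such as an open text file.
    """
    index = {}
    for line_no, line in enumerate(lines, 1):
        for match in WORD_RE.finditer(line):
            location = (line_no, match.start() + 1)
            # setdefault returns the list already stored under the word,
            # or stores a new empty list and returns it.
            index.setdefault(match.group(), []).append(location)
    return index


# Same two lines of text that the notebook writes to its temporary file.
text = io.StringIO('Fluent Python notebooks\nJupyter notebooks')
index = build_index(text)

# Print in case-insensitive alphabetical order, as the notebook does.
for word in sorted(index, key=str.upper):
    print(word, index[word])
```

Because the helper accepts any iterable of lines, the same function should work unchanged with a real file object, for example `with open(file_name) as fp: build_index(fp)`.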