Merge pull request #3 from juanshishido/master
Chapter 3: dictionaries
AllenDowney committed Jan 22, 2017
2 parents f7998f2 + a3099a5 commit b71e738
Showing 2 changed files with 367 additions and 0 deletions.
166 changes: 166 additions & 0 deletions 03-dict-set/dialcodes.ipynb
@@ -0,0 +1,166 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dial codes and Dictionaries\n",
"\n",
"This notebook contains example code from [*Fluent Python*](http://shop.oreilly.com/product/0636920032519.do), by Luciano Ramalho.\n",
"\n",
"Code by Luciano Ramalho, modified by Allen Downey.\n",
"\n",
"MIT License: https://opensource.org/licenses/MIT"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below, we'll show what happens when we create dictionaries using the data in `DIAL_CODES` when that data is sorted in different ways.\n",
"\n",
"We'll start by creating the data—a list of tuples with country codes and country names."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# dial codes of the top 10 most populous countries\n",
"DIAL_CODES = [\n",
" (86, 'China'),\n",
" (91, 'India'),\n",
" (1, 'United States'),\n",
" (62, 'Indonesia'),\n",
" (55, 'Brazil'),\n",
" (92, 'Pakistan'),\n",
" (880, 'Bangladesh'),\n",
" (234, 'Nigeria'),\n",
" (7, 'Russia'),\n",
" (81, 'Japan'),\n",
" ]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can create a Python dictionary, `d1`, by calling `dict()` with `DIAL_CODES` as the argument. The `.keys()` method returns a view of the dictionary's keys (in Python 3 this is a `dict_keys` object, not a list)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d1: dict_keys([880, 1, 86, 55, 7, 234, 91, 92, 62, 81])\n"
]
}
],
"source": [
"d1 = dict(DIAL_CODES)\n",
"print('d1:', d1.keys())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
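"Notice that the keys are *not* in the same order as in `DIAL_CODES`. (This output comes from Python 3.5. Since Python 3.7, dictionaries preserve insertion order, so the keys would appear in input order.)\n",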
"\n",
"Let's create two more dictionaries, this time sorting the input data first: once by dial code and once by country name."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d2: dict_keys([880, 1, 91, 86, 81, 55, 234, 7, 92, 62])\n"
]
}
],
"source": [
"d2 = dict(sorted(DIAL_CODES))\n",
"print('d2:', d2.keys())"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d3: dict_keys([880, 81, 1, 86, 55, 7, 234, 91, 92, 62])\n"
]
}
],
"source": [
"d3 = dict(sorted(DIAL_CODES, key=lambda x: x[1]))\n",
"print('d3:', d3.keys())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, we see the keys are not in the same order as in `DIAL_CODES` or as in `d1`.\n",
"\n",
"However, the three dictionaries compare equal."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"assert d1 == d2 and d2 == d3"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
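The point of `dialcodes.ipynb` can be condensed into a minimal sketch: dictionary equality depends only on the key/value pairs, not on the order in which they were inserted. The three-item list below is a shortened stand-in for the notebook's `DIAL_CODES`, and the `pair` parameter name is chosen here for illustration. The sketch also shows that `.keys()` returns a live view rather than a list.

```python
# Shortened stand-in for the notebook's DIAL_CODES (illustration only).
DIAL_CODES = [(86, 'China'), (91, 'India'), (1, 'United States')]

d1 = dict(DIAL_CODES)                                    # input order
d2 = dict(sorted(DIAL_CODES))                            # sorted by dial code
d3 = dict(sorted(DIAL_CODES, key=lambda pair: pair[1]))  # sorted by country name

# Equality compares key/value pairs only; insertion order is ignored.
assert d1 == d2 == d3

# Comparing keys as sets is independent of iteration order, so it holds
# on Python 3.5 as well as on 3.7+ (where insertion order is preserved).
assert set(d1) == set(d2) == set(d3) == {1, 86, 91}

# .keys() returns a view: it reflects later changes to the dictionary.
keys = d1.keys()
d1[62] = 'Indonesia'
assert 62 in keys
```

Comparing with `set(...)` or `sorted(...)` avoids relying on the iteration order, which is what differs between the Python version used in the notebook and current releases.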
201 changes: 201 additions & 0 deletions 03-dict-set/index.ipynb
@@ -0,0 +1,201 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Handling Missing Keys with `setdefault`\n",
"\n",
"This notebook contains example code from [*Fluent Python*](http://shop.oreilly.com/product/0636920032519.do), by Luciano Ramalho.\n",
"\n",
"Code by Luciano Ramalho, modified by Allen Downey.\n",
"\n",
"MIT License: https://opensource.org/licenses/MIT"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we show two ways to create a mapping of words and their occurrences. For each word in a text file, we create a list of tuples, one for each occurrence. The tuple values represent the position (line and column) of the word in the file. (Note that the line and column positions are indexed starting at one.)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import os\n",
"import re"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"file_name = 'text.txt'"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# temporary text file\n",
"with open(file_name, 'w') as f:\n",
" f.write('Fluent Python notebooks\\nJupyter notebooks')"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"WORD_RE = re.compile(r'\\w+')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# using `.get()` method\n",
"index = {}\n",
"with open(file_name) as fp:\n",
" for line_no, line in enumerate(fp, 1):\n",
" for match in WORD_RE.finditer(line):\n",
" word = match.group()\n",
" column_no = match.start()+1\n",
" location = (line_no, column_no)\n",
" # this is ugly; coded like this to make a point\n",
" occurrences = index.get(word, []) # <1>\n",
" occurrences.append(location) # <2>\n",
" index[word] = occurrences # <3>"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fluent [(1, 1)]\n",
"Jupyter [(2, 1)]\n",
"notebooks [(1, 15), (2, 9)]\n",
"Python [(1, 8)]\n"
]
}
],
"source": [
"# print in alphabetical order\n",
"for word in sorted(index, key=str.upper):\n",
" print(word, index[word])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# better solution\n",
"# using `.setdefault()` method\n",
"index = {}\n",
"with open(file_name) as fp:\n",
" for line_no, line in enumerate(fp, 1):\n",
" for match in WORD_RE.finditer(line):\n",
" word = match.group()\n",
" column_no = match.start()+1\n",
" location = (line_no, column_no)\n",
" index.setdefault(word, []).append(location) # <1>"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fluent [(1, 1)]\n",
"Jupyter [(2, 1)]\n",
"notebooks [(1, 15), (2, 9)]\n",
"Python [(1, 8)]\n"
]
}
],
"source": [
"# print in alphabetical order\n",
"for word in sorted(index, key=str.upper):\n",
" print(word, index[word])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This shows that both blocks of code produce the same index. However, the `.setdefault()` version needs fewer lines of code and is also more efficient: it performs a single key lookup per word, whereas the `.get()` version performs two (one for `.get()` and one for the assignment `index[word] = occurrences`)."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"os.remove(file_name)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
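The two indexing cells in `index.ipynb` can be compared side by side without touching the file system. The following is a sketch, not the notebook's code: it wraps each approach in a function (`build_index_get` and `build_index_setdefault` are names introduced here) and reads the same sample text from a string instead of `text.txt`.

```python
import re

WORD_RE = re.compile(r'\w+')
# Same sample text the notebook writes to its temporary file.
TEXT = 'Fluent Python notebooks\nJupyter notebooks'


def build_index_get(text):
    """Build the word index with .get(), as in the first notebook version."""
    index = {}
    for line_no, line in enumerate(text.splitlines(), 1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            location = (line_no, match.start() + 1)  # 1-based column
            occurrences = index.get(word, [])  # fetch list, or a new empty one
            occurrences.append(location)
            index[word] = occurrences          # store it back
    return index


def build_index_setdefault(text):
    """Build the same index with .setdefault(), as in the second version."""
    index = {}
    for line_no, line in enumerate(text.splitlines(), 1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            location = (line_no, match.start() + 1)  # 1-based column
            # Fetch the list for word, inserting [] first if it is missing.
            index.setdefault(word, []).append(location)
    return index


index_a = build_index_get(TEXT)
index_b = build_index_setdefault(TEXT)
assert index_a == index_b

# Print in case-insensitive alphabetical order, as the notebook does.
for word in sorted(index_b, key=str.upper):
    print(word, index_b[word])
```

Running it prints the same four lines as the notebook's output cells, with `notebooks` mapped to `[(1, 15), (2, 9)]`.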

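The efficiency claim about lookups can be checked roughly by counting how often a key is hashed, since each dictionary lookup hashes the key once. This is a sketch under assumptions: `CountingKey`, `hashes_used`, `with_get`, and `with_setdefault` are hypothetical helpers introduced here, and the counts reflect CPython's `dict` implementation rather than a language guarantee.

```python
class CountingKey:
    """Hypothetical key wrapper that counts how often it is hashed."""
    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


def with_get(index, key, location):
    occurrences = index.get(key, [])  # first lookup
    occurrences.append(location)
    index[key] = occurrences          # second lookup


def with_setdefault(index, key, location):
    index.setdefault(key, []).append(location)  # single lookup


def hashes_used(update):
    """Apply update once to an empty dict and return the number of hash calls."""
    CountingKey.hash_calls = 0
    update({}, CountingKey('notebooks'), (1, 15))
    return CountingKey.hash_calls


get_count = hashes_used(with_get)
setdefault_count = hashes_used(with_setdefault)
print('get:', get_count, 'setdefault:', setdefault_count)
```

Plain `str` keys cache their hash, so the saving in the notebook's word index is in the table probes rather than in hash computation; the wrapper class only makes the number of lookups visible.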