/
Expressions.html
145 lines (144 loc) · 10.4 KB
/
Expressions.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
<div id="ipython-notebook">
<a class="interact-button" href="http://datahub.berkeley.edu/user-redirect/interact?repo=textbook&path=notebooks/Expressions.ipynb">Interact</a>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {
inlineMath: [['$','$']],
processEscapes: true
}
});
</script>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Programming languages are much simpler than human languages. Nonetheless, there are some rules of grammar to learn in any language, and that is where we will begin. In this text, we will use the <a href="https://www.python.org/">Python</a> programming language. Learning the grammar rules is essential, and the same rules used in the most basic programs are also central to more sophisticated programs.</p>
<p>Programs are made up of <em>expressions</em>, which describe to the computer how to combine pieces of data. For example, a multiplication expression consists of a <code>*</code> symbol between two numerical expressions. Expressions, such as <code>3 * 4</code>, are <em>evaluated</em> by the computer. The value (the result of <em>evaluation</em>) of the last expression in each cell, <code>12</code> in this case, is displayed below the cell.</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="mi">3</span> <span class="o">*</span> <span class="mi">4</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>12</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The grammar rules of a programming language are rigid. In Python, the <code>*</code> symbol cannot appear twice in a row. The computer will not try to interpret an expression that differs from its prescribed expression structures. Instead, it will show a <code>SyntaxError</code> error. The <em>Syntax</em> of a language is its set of grammar rules, and a <code>SyntaxError</code> indicates that an expression structure doesn't match any of the rules of the language.</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="mi">3</span> <span class="o">*</span> <span class="o">*</span> <span class="mi">4</span>
</pre></div></div></div>
<div class="output_subarea output_text output_error">
<pre><span class="ansi-cyan-fg"> File </span><span class="ansi-green-fg">"<ipython-input-4-d90564f70db7>"</span><span class="ansi-cyan-fg">, line </span><span class="ansi-green-fg">1</span>
<span class="ansi-red-fg"> 3 * * 4</span>
^
<span class="ansi-red-fg">SyntaxError</span><span class="ansi-red-fg">:</span> invalid syntax
</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Small changes to an expression can change its meaning entirely. Below, the space between the <code>*</code>'s has been removed. Because <code>**</code> appears between two numerical expressions, the expression is a well-formed <em>exponentiation</em> expression (the first number raised to the power of the second: 3 times 3 times 3 times 3). The symbols <code>*</code> and <code>**</code> are called <em>operators</em>, and the values they combine are called <em>operands</em>.</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="mi">3</span> <span class="o">**</span> <span class="mi">4</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>81</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p><strong>Common Operators.</strong> Data science often involves combining numerical values, and the set of operators in a programming language are designed to so that expressions can be used to express any sort of arithmetic. In Python, the following operators are essential.</p>
<table>
<thead><tr>
<th>Expression Type</th>
<th>Operator</th>
<th>Example</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Addition</td>
<td><code>+</code></td>
<td><code>2 + 3</code></td>
<td><code>5</code></td>
</tr>
<tr>
<td>Subtraction</td>
<td><code>-</code></td>
<td><code>2 - 3</code></td>
<td><code>-1</code></td>
</tr>
<tr>
<td>Multiplication</td>
<td><code>*</code></td>
<td><code>2 * 3</code></td>
<td><code>6</code></td>
</tr>
<tr>
<td>Division</td>
<td><code>/</code></td>
<td><code>7 / 3</code></td>
<td><code>2.66667</code></td>
</tr>
<tr>
<td>Remainder</td>
<td><code>%</code></td>
<td><code>7 % 3</code></td>
<td><code>1</code></td>
</tr>
<tr>
<td>Exponentiation</td>
<td><code>**</code></td>
<td><code>2 ** 0.5</code></td>
<td><code>1.41421</code></td>
</tr>
</tbody>
</table></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Python expressions obey the same familiar rules of <em>precedence</em> as in algebra: multiplication and division occur before addition and subtraction. Parentheses can be used to group together smaller expressions within a larger expression.</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="mi">1</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="mi">3</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">*</span> <span class="mi">5</span> <span class="o">/</span> <span class="mi">6</span> <span class="o">**</span> <span class="mi">3</span> <span class="o">+</span> <span class="mi">7</span> <span class="o">+</span> <span class="mi">8</span> <span class="o">-</span> <span class="mi">9</span> <span class="o">+</span> <span class="mi">10</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>17.555555555555557</pre></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="mi">1</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="p">(</span><span class="mi">3</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">*</span> <span class="mi">5</span> <span class="o">/</span> <span class="mi">6</span><span class="p">)</span> <span class="o">**</span> <span class="mi">3</span> <span class="o">+</span> <span class="mi">7</span> <span class="o">+</span> <span class="mi">8</span> <span class="o">-</span> <span class="mi">9</span> <span class="o">+</span> <span class="mi">10</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>2017.0</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h3 id="Example">Example<a class="anchor-link" href="#Example">¶</a></h3><p>Here, from the Washington Post in the early 1980s, is a graph that attempts to compare the earnings of doctors with the earnings of other professionals over a few decades. Do we really need to see two heads (one with a stethoscope) on each bar? Edward Tufte, Professor at Yale and one of the world's experts on visualizing quantitative information, coined the term "chartjunk" for such unnecessary embellishments. This graph is also an example of the "low data-to-ink ratio" that Tufte deplores.</p></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p><img alt="Washington Post graph" src="/images/bad_post_graph.png"/></p></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Most importantly, the horizontal axis of the graph is is not drawn to scale. This has a significant effect on the shape of the bar graphs. When drawn to scale and shorn of decoration, the graphs reveal trends that are quite different from the apparently linear growth in the original. The elegant graph below is due to Ross Ihaka, one of the originators of the statistical system R.</p></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p><img alt="Ross Ihaka's version of Post graph" src="/images/ihaka_fixed_post_graph.png"/></p></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>In the period 1939 to 1963, the doctors' incomes went up from \$3,262 to \$25,050. So during that period the average increase in income per year was about \$900.</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="p">(</span><span class="mi">25050</span> <span class="o">-</span> <span class="mi">3262</span><span class="p">)</span><span class="o">/</span><span class="p">(</span><span class="mi">1963</span> <span class="o">-</span> <span class="mi">1939</span><span class="p">)</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>907.8333333333334</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>In Ross Ihaka's graph you can see that in this period, the doctors' incomes rise roughly linearly at a fairly steady rate. That rate is about \$900, as we have just calculated.</p>
<p>But in the period 1963 to 1976, the rate is more than three times as high:</p></div></div>
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="p">(</span><span class="mi">62799</span> <span class="o">-</span> <span class="mi">25050</span><span class="p">)</span><span class="o">/</span><span class="p">(</span><span class="mi">1976</span> <span class="o">-</span> <span class="mi">1963</span><span class="p">)</span>
</pre></div></div></div>
<div class="output_text output_subarea output_execute_result">
<pre>2903.769230769231</pre></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>That is why the graph rises much more steeply after 1963.</p></div></div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>This chapter introduces many types of expressions. Learning to program involves trying out everything you learn in combination, investigating the behavior of the computer. What happens if you divide by zero? What happens if you divide twice in a row? You don't always need to ask an expert (or the Internet); many of these details can be discovered by trying them out yourself.</p></div></div></div>