Started filling in the text instructions
palewire committed Feb 13, 2017
1 parent ffe346f commit d2e4a66
Showing 7 changed files with 45 additions and 41 deletions.
28 changes: 6 additions & 22 deletions First Python Notebook.ipynb
@@ -1266,7 +1266,7 @@
},
{
"cell_type": "code",
"execution_count": 49,
"execution_count": 87,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1276,33 +1276,16 @@
{
"data": {
"text/plain": [
"PROPOSITION 057 - CRIMINAL SENTENCES. JUVENILE CRIMINAL PROCEEDINGS AND SENTENCING. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 13\n",
"PROPOSITION 056 - CIGARETTE TAX TO FUND HEALTHCARE, TOBACCO USE PREVENTION, RESEARCH, AND LAW ENFORCEMENT. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 12\n",
"PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE. 11\n",
"PROPOSITION 066- DEATH PENALTY. PROCEDURES. INITIATIVE STATUTE. 9\n",
"PROPOSITION 055 - TAX EXTENSION TO FUND EDUCATION AND HEALTHCARE. INITIATIVE CONSTITUTIONAL AMENDMENT. 8\n",
"PROPOSITION 067- REFERENDUM TO OVERTURN BAN ON SINGLE-USE PLASTIC BAGS. 7\n",
"PROPOSITION 062- DEATH PENALTY. INITIATIVE STATUTE. 7\n",
"PROPOSITION 059- SB 254 (CHAPTER 20, STATUTES OF 2016), ALLEN. CAMPAIGN FINANCE: VOTER INSTRUCTION 6\n",
"PROPOSITION 053 - REVENUE BONDS. STATEWIDE VOTER APPROVAL. INITIATIVE CONSTITUTIONAL AMENDMENT. 4\n",
"PROPOSITION 054 - LEGISLATURE. LEGISLATION AND PROCEEDINGS. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 4\n",
"PROPOSITION 058 - SB 1174 (CHAPTER 753, STATUTES OF 2014), LARA. ENGLISH LANGUAGE EDUCATION 4\n",
"PROPOSITION 063- FIREARMS. AMMUNITION SALES. INTIATIVE STATUTE. 4\n",
"PROPOSITION 051 - SCHOOL BONDS. FUNDING FOR K-12 SCHOOL AND COMMUNITY COLLEGE FACILITIES. INITIATIVE STATUTORY AMENDMENT. 4\n",
"PROPOSITION 052 - STATE FEES ON HOSPITALS. FEDERAL MEDI-CAL MATCHING FUNDS. INITIATIVE STATUTORY AND CONSTITUTIONAL AMENDMENT. 3\n",
"PROPOSITION 061- STATE PRESCRIPTION DRUG PURCHASES. PRICING STANDARDS. INITIATIVE STATUTE. 3\n",
"PROPOSITION 060- ADULT FILMS. CONDOMS. HEALTH REQUIREMENTS. INITIATIVE STATUTE. 2\n",
"PROPOSITION 065- CARRY-OUT BAGS. CHARGES. INITIATIVE STATUTE. 1\n",
"Name: prop_name, dtype: int64"
"pandas.core.series.Series"
]
},
"execution_count": 49,
"execution_count": 87,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"props.prop_name.value_counts()"
"type(props.prop_name.value_counts())"
]
},
{
@@ -1702,7 +1685,8 @@
"cell_type": "code",
"execution_count": 54,
"metadata": {
"collapsed": false
"collapsed": false,
"scrolled": true
},
"outputs": [
{
Binary file modified docs/_build_html/.doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/_build_html/.doctrees/index.doctree
Binary file not shown.
20 changes: 14 additions & 6 deletions docs/_build_html/_sources/index.rst.txt
@@ -302,45 +302,53 @@ Quick studies will have already noted the ``prop_name`` column where each commit
.. image:: /_static/column.png

TK
One of the many cool tricks built into ``pandas`` is the ability to total up the frequency of values in a column with the `value_counts <http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html>`_ method. We can use it here to count how many committees were active for each proposition.

.. code-block:: python

    props.prop_name.value_counts()

.. image:: /_static/value_counts.png
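
To see what ``value_counts`` is doing without the full dataset, here is a minimal sketch on a tiny made-up table. The proposition and committee names are invented for illustration, and so is the ``committee_name`` column; only the ``prop_name`` column name comes from the real data.

```python
import pandas

# A made-up stand-in for the committee list (the real one is loaded from the CSV file)
props = pandas.DataFrame({
    "prop_name": ["PROP A", "PROP B", "PROP A", "PROP C", "PROP A", "PROP B"],
    "committee_name": ["One", "Two", "Three", "Four", "Five", "Six"],
})

# Count how many rows (committees) share each value in the column
counts = props.prop_name.value_counts()
print(counts)
```

By default the result is sorted from the most frequent value to the least, which is why the busiest proposition lands at the top.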

TK
You may have noticed that neither of the previous methods returned a clean-looking table the way ``head`` did. It's often hard to anticipate, but in these cases and many others ``pandas`` will return an ugly `Series <http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html>`_ rather than the more aesthetically pleasing (and powerful) ``DataFrame``.

If that sounds like a bunch of mumbo jumbo, that's because it is! Like most computer programming tools, ``pandas`` has its own odd quirks that you have to pick up as you go. The difference between a ``Series`` and a ``DataFrame`` is one of those. The key is to not worry about it too much and keep hacking.

In most instances, if you have an ugly series generated by a method like ``value_counts`` and you want to convert it into a ``DataFrame``, you can do so by tacking the `reset_index <http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.reset_index.html>`_ method onto the tail end. Why? Again, the answer is "because ``pandas`` says so." So let's play along.

.. code-block:: python

    props.prop_name.value_counts().reset_index()

.. image:: /_static/value_counts_df.png
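
If you'd like to check the difference for yourself, Python's built-in ``type`` function will tell you what you're holding. This is only a sketch on invented proposition names, and the column labels ``reset_index`` hands back may differ depending on which version of ``pandas`` you have installed.

```python
import pandas

# Made-up data standing in for the real committee list
props = pandas.DataFrame({"prop_name": ["PROP A", "PROP B", "PROP A"]})

# value_counts hands back a Series
counts = props.prop_name.value_counts()
print(type(counts))

# reset_index turns that Series into a two-column DataFrame
table = counts.reset_index()
print(type(table))
print(table)
```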

TK
Now that we've seen all the propositions in the dataset, we're ready to take a crucial step towards our goal by filtering the list down to just those committees that supported or opposed Proposition 64.

We can do that by copying the full name of the proposition that appears in the dataset and inserting it into the following statement, which follows the ``pandas`` system for filtering a ``DataFrame``.

You start with the variable you want to filter, then build a comparison by combining a column, an `"operator" <https://en.wikipedia.org/wiki/Operator_(computer_programming)>`_ like ``==``, ``>`` or ``<``, and a value to compare the field against.

.. code-block:: python

    props[props.prop_name == 'PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.']

.. image:: /_static/prop_filter.png
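
It can help to pull the filter apart into its two steps. The sketch below uses invented proposition and committee names, and the ``matches`` variable and ``committee_name`` column are just labels chosen for this example.

```python
import pandas

# Made-up data standing in for the real committee list
props = pandas.DataFrame({
    "prop_name": ["PROP A", "PROP B", "PROP A", "PROP C"],
    "committee_name": ["One", "Two", "Three", "Four"],
})

# Step 1: the comparison alone returns one True or False per row
matches = props.prop_name == "PROP A"
print(matches.tolist())

# Step 2: putting the comparison inside the brackets keeps only the True rows
print(props[matches])
```

The match has to be exact, which is why it's safest to copy the proposition's name straight out of the dataset, odd spacing and all.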

TK
Now that we've seen what it outputs, we should save the results of that filter into a new variable, separate from the full list we imported from the CSV file.

.. code-block:: python

    prop = props[props.prop_name == 'PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.']
TK
To find out how many records are left after the filter, we can use Python's built-in `len <https://docs.python.org/2/library/functions.html#len>`_ function to inspect our new variable.

.. code-block:: python

    len(prop)

.. image:: /_static/prop_len.png
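
Here is the same save-then-count pattern on a small invented table, as a rough sketch. Since the filtered rows live under their own name, the full list stays available under the old one.

```python
import pandas

# Made-up data standing in for the real committee list
props = pandas.DataFrame({"prop_name": ["PROP A", "PROP B", "PROP A", "PROP C"]})

# Save the filtered rows under a new name
prop = props[props.prop_name == "PROP A"]

# len counts rows, so it reports how many records survived the filter
print(len(prop))

# The original table still holds every row
print(len(props))
```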

TK
With that, we're ready to move on to a related task: importing all of the individual contributions reported to last year's 17 ballot measures and filtering them down to just those supporting or opposing Proposition 64.

.. code-block:: python
16 changes: 10 additions & 6 deletions docs/_build_html/index.html
@@ -333,31 +333,35 @@ <h2>Act 3: Hello data<a class="headerlink" href="#act-3-hello-data" title="Perma
</pre></div>
</div>
<img alt="_images/column.png" src="_images/column.png" />
<p>TK</p>
<p>One of the many cool tricks built into <code class="docutils literal"><span class="pre">pandas</span></code> is the ability to total up the frequency of values in a column with the <a class="reference external" href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html">value_counts</a> method. We can use it here to count how many committees were active for each proposition.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">props</span><span class="o">.</span><span class="n">prop_name</span><span class="o">.</span><span class="n">value_counts</span><span class="p">()</span>
</pre></div>
</div>
<img alt="_images/value_counts.png" src="_images/value_counts.png" />
<p>TK</p>
<p>You may have noticed that neither of the previous methods returned a clean-looking table the way <code class="docutils literal"><span class="pre">head</span></code> did. It&#8217;s often hard to anticipate, but in these cases and many others <code class="docutils literal"><span class="pre">pandas</span></code> will return an ugly <a class="reference external" href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html">Series</a> rather than the more aesthetically pleasing (and powerful) <code class="docutils literal"><span class="pre">DataFrame</span></code>.</p>
<p>If that sounds like a bunch of mumbo jumbo, that&#8217;s because it is! Like most computer programming tools, <code class="docutils literal"><span class="pre">pandas</span></code> has its own odd quirks that you have to pick up as you go. The difference between a <code class="docutils literal"><span class="pre">Series</span></code> and a <code class="docutils literal"><span class="pre">DataFrame</span></code> is one of those. The key is to not worry about it too much and keep hacking.</p>
<p>In most instances, if you have an ugly series generated by a method like <code class="docutils literal"><span class="pre">value_counts</span></code> and you want to convert it into a <code class="docutils literal"><span class="pre">DataFrame</span></code>, you can do so by tacking the <a class="reference external" href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.reset_index.html">reset_index</a> method onto the tail end. Why? Again, the answer is &#8220;because <code class="docutils literal"><span class="pre">pandas</span></code> says so.&#8221; So let&#8217;s play along.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">props</span><span class="o">.</span><span class="n">prop_name</span><span class="o">.</span><span class="n">value_counts</span><span class="p">()</span><span class="o">.</span><span class="n">reset_index</span><span class="p">()</span>
</pre></div>
</div>
<img alt="_images/value_counts_df.png" src="_images/value_counts_df.png" />
<p>TK</p>
<p>Now that we&#8217;ve seen all the propositions in the dataset, we&#8217;re ready to take a crucial step towards our goal by filtering the list down to just those committees that supported or opposed Proposition 64.</p>
<p>We can do that by copying the full name of the proposition that appears in the dataset and inserting it into the following statement, which follows the <code class="docutils literal"><span class="pre">pandas</span></code> system for filtering a <code class="docutils literal"><span class="pre">DataFrame</span></code>.</p>
<p>You start with the variable you want to filter, then build a comparison by combining a column, an <a class="reference external" href="https://en.wikipedia.org/wiki/Operator_(computer_programming)">&#8220;operator&#8221;</a> like <code class="docutils literal"><span class="pre">==</span></code>, <code class="docutils literal"><span class="pre">&gt;</span></code> or <code class="docutils literal"><span class="pre">&lt;</span></code>, and a value to compare the field against.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">props</span><span class="p">[</span><span class="n">props</span><span class="o">.</span><span class="n">prop_name</span> <span class="o">==</span> <span class="s1">&#39;PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.&#39;</span><span class="p">]</span>
</pre></div>
</div>
<img alt="_images/prop_filter.png" src="_images/prop_filter.png" />
<p>TK</p>
<p>Now that we&#8217;ve seen what it outputs, we should save the results of that filter into a new variable, separate from the full list we imported from the CSV file.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">prop</span> <span class="o">=</span> <span class="n">props</span><span class="p">[</span><span class="n">props</span><span class="o">.</span><span class="n">prop_name</span> <span class="o">==</span> <span class="s1">&#39;PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.&#39;</span><span class="p">]</span>
</pre></div>
</div>
<p>TK</p>
<p>To find out how many records are left after the filter, we can use Python&#8217;s built-in <a class="reference external" href="https://docs.python.org/2/library/functions.html#len">len</a> function to inspect our new variable.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="nb">len</span><span class="p">(</span><span class="n">prop</span><span class="p">)</span>
</pre></div>
</div>
<img alt="_images/prop_len.png" src="_images/prop_len.png" />
<p>TK</p>
<p>With that, we&#8217;re ready to move on to a related task: importing all of the individual contributions reported to last year&#8217;s 17 ballot measures and filtering them down to just those supporting or opposing Proposition 64.</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">contribs</span> <span class="o">=</span> <span class="n">pandas</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">&quot;contributions.csv&quot;</span><span class="p">)</span>
</pre></div>
</div>
