QA notes from first half of Dave thoughts
craigsdennis committed Nov 6, 2018
1 parent 042cedc commit 09f8e6e
Showing 10 changed files with 151 additions and 121 deletions.
4 changes: 2 additions & 2 deletions data/creation.py
Expand Up @@ -99,8 +99,8 @@ def user_dict(self):

user_row_dict['adrian'] = {
'first_name': 'Adrian',
'last_name': 'Yang',
'email': 'adrian.yang@teamtreehouse.com',
'last_name': 'Fang',
'email': 'adrian.fang@teamtreehouse.com',
'email_verified': fake.email_verified(),
'signup_date': fake.signup_date(),
'referral_count': fake.random_int(0, 7),
Expand Down
2 changes: 1 addition & 1 deletion data/users.csv
Expand Up @@ -2,7 +2,7 @@
aaron,Aaron,Davis,aaron6348@gmail.com,True,2018-08-31,6,18.14
acook,Anthony,Cook,cook@gmail.com,True,2018-05-12,2,55.45
adam.saunders,Adam,Saunders,adam@gmail.com,False,2018-05-29,3,72.12
adrian,Adrian,Yang,adrian.yang@teamtreehouse.com,True,2018-04-28,3,30.01
adrian,Adrian,Fang,adrian.fang@teamtreehouse.com,True,2018-04-28,3,30.01
adrian.blair,Adrian,Blair,adrian9335@gmail.com,True,2018-06-16,7,25.85
alan9443,Alan,Pope,pope@hotmail.com,True,2018-04-17,0,56.09
alexander7808,Alexander,Moore,alexander.moore@gmail.com,False,2018-03-27,2,87.71
Expand Down
18 changes: 14 additions & 4 deletions s1n01-creating-a-series.ipynb
Expand Up @@ -28,7 +28,10 @@
"source": [
"## Creating from a dictionary\n",
"\n",
"Let's use this sample data here. In our example, `test_balance_data` is just a standard Python dictionary the key is username, and the value is that user's current account balance. "
"\n",
"Let's use this sample data we got from CashBox. They want to track their users' balances, that is, how much money each user currently has in their account. CashBox requires that every user create a username.\n",
"\n",
"In our example, `test_balance_data` is just a standard Python dictionary: the key is the username, and the value is that user's current account balance. "
]
},
{
Expand Down Expand Up @@ -164,7 +167,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Note, the order of the labels is guaranteed. "
"Note that the order of the labels is guaranteed to match the order of the supplied index. "
]
},
{
Expand Down Expand Up @@ -195,7 +198,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"One thing to remember is that a NumPy array is also iterable. In fact, you'll find NumPy and Pandas get along really well together."
"One thing to remember is that a NumPy array is also iterable, so you can create a new `Series` from an `ndarray`. In fact, you'll find NumPy and Pandas get along very well together."
]
},
{
Expand Down Expand Up @@ -229,7 +232,7 @@
"source": [
"## Creating from a scalar and an index\n",
"\n",
"If you pass in a scalar that value will be broadcasted to the keys specified in the index argument"
"If you pass in a scalar (remember, that is a single value), it will be broadcast to each of the keys specified in the `index` keyword argument."
]
},
{
Expand Down Expand Up @@ -257,6 +260,13 @@
"pd.Series(20.00, index=[\"guil\", \"jay\", \"james\", \"ben\", \"nick\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In other words, each key is assigned the same scalar value for the entire `Series`."
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down
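The creation paths touched in this notebook (dictionary, iterable with an index, `ndarray`, and scalar with an index) can be sketched side by side. This is a minimal illustration, not the notebook's actual cells: the usernames and amounts are borrowed from the cell output shown in the `s1n02` diff, and the assumption that they are the contents of `test_balance_data` is mine.

```python
import numpy as np
import pandas as pd

# Assumed contents of test_balance_data (values borrowed from the s1n02 output).
test_balance_data = {"pasan": 20.00, "treasure": 20.18, "ashley": 1.05, "craig": 42.42}

# From a dictionary: keys become the labels, values become the data.
balances = pd.Series(test_balance_data)

# From an iterable plus an explicit index: labels follow the order of the index.
labels = ["pasan", "treasure", "ashley", "craig"]
from_list = pd.Series([20.00, 20.18, 1.05, 42.42], index=labels)

# From a NumPy ndarray: it is iterable too, so the same constructor call works.
from_array = pd.Series(np.array([20.00, 20.18, 1.05, 42.42]), index=labels)

# From a scalar: the single value is broadcast to every key in the index.
same_for_all = pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])

print(balances)
print(same_for_all)
```

All three non-scalar constructions should produce equivalent `Series` objects; the scalar form is mostly useful for initializing every label with the same starting value.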
77 changes: 68 additions & 9 deletions s1n02-accessing-a-series.ipynb
Expand Up @@ -6,9 +6,9 @@
"source": [
"# Accessing a Series\n",
"\n",
"There are multiple ways to get to the data that is stored in your `Series`. Let's explore the **`balances`** `Series`. \n",
"There are multiple ways to get to the data stored in your `Series`. Let's explore the **`balances`** `Series`. \n",
"\n",
"Remember, the `Series` is indexed by username. The label is the username, the value is that user's balance."
"Remember, the `Series` is indexed by username. The label is the username, the value is that user's cash balance."
]
},
{
Expand Down Expand Up @@ -95,9 +95,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The value is wrapped in a `NumPy.Scalar` so that it keeps it's data type and will play well with others.\n",
"The value is wrapped in a [`NumPy.Scalar`](https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.scalars.html) so that it keeps its data type and will play well with other data types and NumPy data structures.\n",
"\n",
"The same positional indexing works just as it does with a standard list."
"The same positional indexing works just as it does with a standard list. The indices start at 0, and negative numbers can be used to access values from the end of the list."
]
},
{
Expand Down Expand Up @@ -156,6 +156,65 @@
"### `Series` behave like dictionaries"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"The label pasan has a value of 20.0"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"The label treasure has a value of 20.18"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"The label ashley has a value of 1.05"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"The label craig has a value of 42.42"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for label, value in balances.items():\n",
" render(\"The label {} has a value of {}\".format(label, value))"
]
},
{
"cell_type": "code",
"execution_count": 6,
Expand Down Expand Up @@ -259,9 +318,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Accessing More Explicitly\n",
"## Accessing More Explicitly with `loc` and `iloc`\n",
"\n",
"We are using indexing which can *either* be a label *or* a positional index. This can get confusing. It's possible to be more explicit, [which yes wise Pythonista](https://www.python.org/dev/peps/pep-0020/), is always better than implicit.\n",
"So far we have used both a label and a positional index to access a value. It can be unclear which of the two is being used. To remove this ambiguity, you can be more explicit, [which, yes, wise Pythonista](https://www.python.org/dev/peps/pep-0020/), is better than implicit.\n",
"\n",
"A `Series` exposes a property named `loc` which can be used to explicitly look up values by label-based indices only."
]
Expand Down Expand Up @@ -321,15 +380,15 @@
"## Accessing by Slice\n",
"Like a NumPy array, a `Series` also provides a way to use slices to get different portions of the data, returned as a `Series`. \n",
"\n",
"*NOTE*: Slicing with indices vs. labels behaves differently. The latter is inclusive."
"*WARNING*: Slicing with indices vs. labels behaves differently. The latter is inclusive."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Slicing by Positional Index\n",
"When using positional indices, the slice is exclusive..."
"When using positional indices, the slice is exclusive. The last item **is not** included."
]
},
{
Expand Down Expand Up @@ -362,7 +421,7 @@
"metadata": {},
"source": [
"### Slicing by Label\n",
"When using labels, the slice is inclusive..."
"When using labels, the slice is inclusive. The last item **is** included."
]
},
{
Expand Down
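The access patterns described in this notebook (dictionary-style iteration, explicit `loc` and `iloc` lookups, and the inclusive versus exclusive slice behavior) can be sketched as follows. The `balances` data mirrors the cell output in the diff above; the variable names other than `balances` are hypothetical.

```python
import pandas as pd

# Same labels and values as the rendered output in the diff above.
balances = pd.Series({"pasan": 20.00, "treasure": 20.18, "ashley": 1.05, "craig": 42.42})

# Dictionary-like iteration over (label, value) pairs.
pairs = list(balances.items())

# Explicit lookups: loc is label-based only, iloc is position-based only.
by_label = balances.loc["ashley"]
first = balances.iloc[0]
last = balances.iloc[-1]  # negative positions count from the end

# Positional slice: the end position is NOT included.
positional = balances.iloc[0:2]

# Label slice: the end label IS included.
labelled = balances.loc["pasan":"ashley"]

print(positional)
print(labelled)
```

Here the positional slice stops before position 2, so it holds two rows, while the label slice runs through `ashley` and holds three.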
8 changes: 4 additions & 4 deletions s1n03-vectorization-and-broadcasting.ipynb
Expand Up @@ -6,7 +6,7 @@
"source": [
"# Series Vectorization and Broadcasting\n",
"\n",
"Just like NumPy, pandas offers powerful vectorized methods and leans on broadcasting.\n",
"Just like NumPy, pandas offers powerful vectorized methods. It also leans on broadcasting.\n",
"\n",
"Let's explore!"
]
Expand Down Expand Up @@ -75,7 +75,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"...it's important to remember to lean on vectorization and skip the loops altogether."
"...it's important to remember to lean on vectorization and skip the loops altogether. Vectorization is faster and, as you can see, easier to read and write."
]
},
{
Expand Down Expand Up @@ -119,7 +119,7 @@
"metadata": {},
"source": [
"### Broadcasting a Scalar\n",
"Also just like NumPy arrays, the mathematical operators have been overridden to use the vectorized versions of the same opration."
"Also just like NumPy arrays, the mathematical operators have been overridden to use the vectorized versions of the same operation."
]
},
{
Expand Down Expand Up @@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Using the `fill_value`\n",
"#### Using the `fill_value` parameter\n",
"It is possible to fill missing values so that everything aligns. The concept is to use the `add` method directly along with the keyword argument `fill_value`."
]
},
Expand Down
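A small sketch of the three ideas in this notebook: looping versus vectorizing, broadcasting a scalar through an overridden operator, and aligning mismatched labels with `fill_value`. The `deposits` series and the bonus amount of 5 are hypothetical values chosen only for illustration.

```python
import pandas as pd

balances = pd.Series({"pasan": 20.00, "treasure": 20.18, "ashley": 1.05, "craig": 42.42})

# Loop version: works, but is slower and noisier than the vectorized form.
looped = pd.Series({label: value + 5 for label, value in balances.items()})

# Vectorized version: the + operator broadcasts the scalar to every value.
bonus = balances + 5

# Hypothetical deposits that cover only some of the users.
deposits = pd.Series({"pasan": 10.00, "craig": 2.50})

# Plain addition aligns on labels; labels missing from either side become NaN.
unaligned = balances + deposits

# The add method with fill_value treats the missing entries as 0 instead.
aligned = balances.add(deposits, fill_value=0)

print(unaligned)
print(aligned)
```

The design point is that `fill_value` substitutes for the missing side before the operation runs, so users without a deposit keep their original balance rather than turning into `NaN`.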
2 changes: 1 addition & 1 deletion s1n04-creating-a-dataframe.ipynb
Expand Up @@ -37,7 +37,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If your data is already in rows and columns you can just pass it along to the constructor. Labels and Column headings will be automatically generated as a range."
"If your data is already in rows and columns, like a list of lists, you can just pass it along to the constructor. Row labels and column headings will be automatically generated as a range."
]
},
{
Expand Down
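As a rough illustration of the list-of-lists case, the sketch below passes rows straight to the `DataFrame` constructor. The rows reuse names and amounts from the `data/users.csv` hunk above; treating the last CSV field as a balance, and the column names used in the second call, are assumptions on my part.

```python
import pandas as pd

# Rows already shaped as rows and columns (a list of lists).
rows = [
    ["Aaron", "Davis", 18.14],
    ["Anthony", "Cook", 55.45],
    ["Adam", "Saunders", 72.12],
]

# No labels supplied: row labels and column headings are generated as ranges.
users = pd.DataFrame(rows)

# Assumed column names and usernames, supplied explicitly instead.
named = pd.DataFrame(
    rows,
    columns=["first_name", "last_name", "balance"],
    index=["aaron", "acook", "adam.saunders"],
)

print(users)
print(named)
```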
8 changes: 5 additions & 3 deletions s1n05-accessing-a-dataframe.ipynb
Expand Up @@ -5,7 +5,7 @@
"metadata": {},
"source": [
"# Accessing a DataFrame\n",
"There are many [different choices for indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) DataFrames available.\n",
"There are many [different choices for indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) DataFrames.\n",
"\n",
"Let's explore!"
]
Expand Down Expand Up @@ -36,7 +36,7 @@
"## Retrieve a specific Series\n",
"\n",
"### By Column Name\n",
"Each column is actually a `Series`. The `DataFrame` provides access to each of these `Series` by a column name index.\n",
"Each column in a `DataFrame` is actually a `Series`. The `DataFrame` provides access to each of these `Series` by a column name index.\n",
"\n",
"For instance, to get the **`balance`** `Series`, you could just use that for the index."
]
Expand Down Expand Up @@ -286,7 +286,9 @@
"source": [
"## Retrieve a Specific DataFrame Through Slicing\n",
"\n",
"Using the `loc` and `iloc` properties you can slice an existing `DataFrame` into a new one."
"Using the `loc` and `iloc` properties you can slice an existing `DataFrame` into a new one.\n",
"\n",
"In the example below, we use `:` on the rows axis to select all rows, and we specify which columns we want back using a list on the columns axis, à la NumPy fancy indexing."
]
},
{
Expand Down
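The column lookup and the `:` plus column-list slicing described in this notebook might look like the sketch below. The `users` frame is a hypothetical stand-in built from a few values in the `data/users.csv` hunk; the notebook's real `DataFrame` and its column set are not shown in this diff.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame.
users = pd.DataFrame(
    {
        "first_name": ["Aaron", "Anthony", "Adam"],
        "last_name": ["Davis", "Cook", "Saunders"],
        "balance": [18.14, 55.45, 72.12],
    },
    index=["aaron", "acook", "adam.saunders"],
)

# Each column is a Series, retrieved by using the column name as the index.
balance = users["balance"]

# ":" selects every row; the list picks the columns to keep.
subset = users.loc[:, ["first_name", "balance"]]

# The same selection by position with iloc.
by_position = users.iloc[:, [0, 2]]

# Label slices on the rows axis include the end label.
row_slice = users.loc["aaron":"acook", ["balance"]]

print(subset)
print(row_slice)
```

Passing a list on the columns axis returns a new `DataFrame` even when the list holds a single column, whereas a bare column name returns a `Series`.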

0 comments on commit 09f8e6e
