01-starting-with-data: organizing statistics in a Python table #280

Closed
ErinBecker wants to merge 4 commits into datacarpentry:gh-pages from gustavothomas17:gh-pages

@ErinBecker

Submitted on behalf of @gustavothomas17.

…Frame"

The section title starts with "Calculating Statistics", but the section did not actually calculate any statistics. I added a general example using the `.describe()` method and a related challenge exercise at the end of the section.
Updating section "Calculating Statistics…"

@maxim-belkin left a comment

Thanks @ErinBecker and @gustavothomas17. Please see my comments.

> species are in the data?
>
> 2. What is the difference between `len(site_names)` and `surveys_df['site_id'].nunique()`?
{: .challenge}

Why do you delete this line?
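
For context on the challenge question quoted above, here is a minimal sketch of how the two counts can differ. It assumes `site_names` was built with `pd.unique(surveys_df['site_id'])` (an assumption; the quoted lines do not show its definition) and uses a tiny made-up table rather than the real survey data.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's surveys table (made-up values).
surveys_df = pd.DataFrame({"site_id": [1, 2, 2, 3, None, 3]})

# Assumption: site_names is created with pd.unique, which keeps NaN as a value.
site_names = pd.unique(surveys_df["site_id"])

# len() counts every entry in the array, including the NaN.
n_from_len = len(site_names)

# nunique() skips missing values by default (dropna=True).
n_from_nunique = surveys_df["site_id"].nunique()

print(n_from_len, n_from_nunique)
```

When the column has no missing values, both expressions give the same number.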

Pandas' `describe` function will only return summary values for columns
containing numeric data (note that for the columns `species_id` and `sex`
nothing was returned). Further exploring our data set, we might want
to know how many animals were collected in each plot, or how many of each.

@maxim-belkin commented on May 9, 2018

This whole new section partially repeats the text that follows.
Please remove the duplicate text either here or in the paragraphs that follow.
Also, please use the following syntax for code blocks:

~~~
python code
~~~
{: .language-python}
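
To illustrate the behavior described in the quoted lesson text, here is a minimal sketch of `describe()` on a table with both numeric and text columns. The column names follow the ones mentioned in this thread, but the values are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of the surveys table (made-up values).
surveys_df = pd.DataFrame({
    "plot_id": [2, 3, 2, 7],
    "species_id": ["NL", "NL", "DM", "DM"],
    "sex": ["M", "M", "F", "M"],
    "weight": [40.0, 48.0, 29.0, 46.0],
})

# By default, describe() on a mixed table summarizes only the numeric columns.
numeric_summary = surveys_df.describe()
print(list(numeric_summary.columns))

# include="all" also reports the non-numeric columns (count, unique, top, freq).
full_summary = surveys_df.describe(include="all")
print(full_summary)
```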

@maxim-belkin, I cloned the repository to my PC, made the changes you requested, and pushed them back. I hope I did it correctly; if not, please let me know what I did wrong and I will fix it.

@gustavothomas17 apologies for the delay...

You used your `gh-pages` branch, which complicates things a little. I will close this PR, but please submit a new one with your suggested changes from a different branch. Here is what you should do:

git clone https://github.com/datacarpentry/python-ecology-lesson new-directory
cd new-directory
git branch my-changes
git checkout my-changes
## implement your changes
git add -u
git commit
git push --force https://github.com/gustavothomas17/python-ecology-lesson gh-pages:gh-pages
git push https://github.com/gustavothomas17/python-ecology-lesson

and then submit this PR again.

I will add a few comments to your PR - please take them into account when submitting a new request.

Thank you very much for your time, @maxim-belkin. I hope I followed your instructions properly this time.

@maxim-belkin changed the title from "organizing statistics in a python table" to "01-starting-with-data: organizing statistics in a Python table" on May 9, 2018

@maxim-belkin left a comment

Please see my comments.
I am closing this PR, but please submit a new one following the steps I described.

@@ -1,3 +1,4 @@

don't add this empty line


> ## Challenge - Statistics
>
> 1. What type of summary statistics is given by `surveys_df['sex'].describe()`?

again, you are not removing the exercise that was originally here (line 385 and below). If you are proposing to replace the existing exercise, you should delete it.
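
As background for the proposed challenge question, here is a minimal sketch of what `describe()` reports for a text column such as `sex`. The values are made up, and the table is a hypothetical stand-in for the real survey data.

```python
import pandas as pd

# Hypothetical stand-in for the surveys table; one record has no sex recorded.
surveys_df = pd.DataFrame({"sex": ["M", "M", "F", "M", None]})

# For a text column, describe() reports count, unique, top and freq
# instead of the mean, std and quartiles shown for numeric columns.
summary = surveys_df["sex"].describe()
print(summary)
```

Here `count` is the number of non-missing entries, `unique` the number of distinct values, `top` the most frequent value, and `freq` how often that value occurs.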


We can calculate basic statistics for all records in a single column using the
syntax below:
We often want to calculate the summary statistics grouped by subsets or

the original text is not ideal either but... say something like:

we calculate statistics and display its summary...

```

We can also extract one specific metric if we wish:
We can also extract specific metrics if we wish:

quantities of interest to us

maxim-belkin pushed a commit that referenced this pull request Jun 19, 2018
Rename CoC file to align with GitHub's Community Profile expectation