Episode 06-agg Add Definition of Aggregation #240

Open
nathan-hook opened this Issue May 7, 2018 · 5 comments

nathan-hook commented May 7, 2018

During my training checkout demo, I completely blanked on what an aggregation actually is.

Having a definition at the beginning of the lesson might be useful.

Here is an example definition:

"An aggregate function performs a calculation on a set of values, and returns a single value."

Taken from this page:
Aggregate Functions (Transact-SQL)
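
To make that definition concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `readings`, the column name `value`, and the numbers are all made up for illustration; they are not taken from the lesson.

```python
import sqlite3

# Hypothetical in-memory table; "readings" and "value" are illustrative names only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value REAL)")
conn.executemany("INSERT INTO readings VALUES (?)", [(2.0,), (4.0,), (9.0,)])

# Each aggregate function consumes the whole set of values and yields one value.
total = conn.execute("SELECT sum(value) FROM readings").fetchone()[0]
mean = conn.execute("SELECT avg(value) FROM readings").fetchone()[0]

print(total, mean)
conn.close()
```

Three input values go in, and each of `sum` and `avg` hands back exactly one number.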

Collaborator

remram44 commented May 7, 2018

At the top of the lesson, I can see:

  • Questions: How can I calculate sums, averages, and other summary values?
  • We now want to calculate ranges and averages for our data.
  • Each of these functions takes a set of records as input, and produces a single record as output

Are you sure a definition should be added on top of those?
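
The "set of records in, single record out" behaviour quoted above can be sketched as follows. This assumes a simplified, hypothetical `survey` table with invented values, not the lesson's actual data.

```python
import sqlite3

# Hypothetical, simplified table; names and values are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (person TEXT, reading REAL)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [("p1", 1.0), ("p2", 4.0), ("p3", 7.0)],
)

# A plain SELECT returns one output record per input record.
plain_rows = conn.execute("SELECT reading FROM survey").fetchall()

# An aggregating SELECT collapses all input records into a single record.
summary_rows = conn.execute(
    "SELECT min(reading), max(reading), avg(reading) FROM survey"
).fetchall()

print(len(plain_rows), len(summary_rows))
conn.close()
```

The plain query keeps all three records, while the aggregating query returns one record holding the minimum, maximum, and average.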

nathan-hook commented May 7, 2018

I don't feel as though those are definitions. I feel as though those statements/questions get around the idea of what an aggregation is, but never actually define the word.

It would be like if we defined a database the following way:

  • Questions: How can I store relational data in a storage mechanism that allows that data to be related to one another?
  • We want to get values out of our data.
  • Each of these statements, SELECT, UPDATE, DELETE

Collaborator

remram44 commented May 7, 2018

You'll notice that we didn't include a definition of a database either; in fact, the wording there is very similar:

Three common options for storage are text files, spreadsheets, and databases. [...] Spreadsheets are good for doing simple analyses, but they don’t handle large or complex data sets well. Databases, however, include powerful tools for search and analysis, and can handle large, complex data sets.

nathan-hook commented May 8, 2018

I thought there was a pretty good definition of a database in the first lesson:
http://swcarpentry.github.io/sql-novice-survey/01-select/

A relational database is a way to store and manipulate information. Databases are arranged as tables. Each table has columns (also known as fields) that describe the data, and rows (also known as records) which contain the data.
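
As a rough illustration of the table, column (field), and row (record) vocabulary in that definition, here is a small sketch with sqlite3. The `person` table and its contents are invented for the example.

```python
import sqlite3

# Hypothetical table; column names and rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id TEXT, personal TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("a", "Ann", "Lee"), ("b", "Bo", "Kim")],
)

cursor = conn.execute("SELECT * FROM person ORDER BY id")
# Columns (fields) describe the data ...
columns = [description[0] for description in cursor.description]
# ... and rows (records) contain the data.
rows = cursor.fetchall()

print(columns)
print(rows)
conn.close()
```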

And to be clear, I am not trying to be confrontational. I am just mentioning something that would have helped me while I was teaching the aggregation section. :)

Collaborator

remram44 commented May 8, 2018

No worries! I am very open to a PR rewriting this to make it more explicit; I'm just not sure that the information in your definition is actually missing. The current format is indeed less formal than an explicit definition, but is a formal definition required? I am just not sure.

fmichonneau pushed a commit that referenced this issue Jun 19, 2018
