Promoted SQL's set-based advantages #268
SQL's set-based syntax is unfamiliar to people who have only used iterative programming languages, particularly those used to reading and writing delimited text files. SQL offers more concise, self-describing ways to model, relate, and aggregate data.
Hi @johnthomaswright, thanks for the contribution! I'm with you, I think it makes sense to make it clear what SQL is about. However I am not sure about "set-based language".
In _episodes/01-select.md:
@@ -12,6 +12,7 @@ keypoints:
- "A relational database stores information in tables, each of which has a fixed set of columns and a variable number of records."
- "A database manager is a program that manipulates information stored in a database."
- "We write queries in a specialized language called SQL to extract information from databases."
+- "SQL is a set-based language, where we specify what data and format to return or save; but not how to save it or how to retrieve it (like you would specify in iterative loops in other languages.)
Maybe "query-based" language instead? "set" usually refers to unordered collections.
Considering that a relation is defined to be a set of tuples, "set based language" is accurate.
Queries are surely a central part of SQL, but there's also data manipulation, as well as data definition. Sets (or relations) are the unifying theme, as all defining, manipulating and querying pertains to relations at the end of the day.
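To make that point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `reading` table and its rows are hypothetical, made up for illustration and not taken from the lesson's database. It shows data definition, data manipulation, and a query each acting on a whole set of rows at once.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")

# Data definition: declare the relation (its columns), not how it is stored.
conn.execute("CREATE TABLE reading (person TEXT, quant TEXT, value REAL)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [("dyer", "rad", 9.82), ("dyer", "sal", 0.13),
     ("lake", "sal", 0.05), ("lake", "rad", 1.46)],
)

# Data manipulation: one statement rewrites every row in the matching set.
cur = conn.execute("UPDATE reading SET quant = 'salinity' WHERE quant = 'sal'")
updated_rows = cur.rowcount

# Querying: the result is itself a set of tuples, so we collect it as one.
sal_rows = set(
    conn.execute("SELECT person, value FROM reading WHERE quant = 'salinity'")
)
```

None of the three statements says anything about loops or row order; each one only describes which rows it applies to.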
I would also recommend using query rather than set
You’re absolutely right, and I think the subtlety of a set being unordered is exactly the point. SQL illustrates set-based design because you provide no basis for the order in which rows are searched, filtered, and joined; you only specify the order of the end results through the ORDER BY clause. (That holds for a pure, no-hints query, like 99% of all queries, where you don’t tip off the execution plan engine. The other 1% is definitely not a fruitful topic for digression in Carpentries...!)
https://www.itprotoday.com/analytics-reporting/programming-sql-set-based-way
The referenced article provides some great examples. Let me know if this makes more sense as a way to think about things in sets and not in iterative/programmatic loops. Thanks!
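As a small illustration of the ordering point (a sketch only, run through Python's built-in `sqlite3` module against a hypothetical `person` table that is not the lesson's actual data):

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("lake", "Lake"), ("dyer", "Dyer"), ("roe", "Roerich")],
)

# No ORDER BY: the engine is free to return the rows in any order,
# so the result should only be treated as a set.
unordered = conn.execute("SELECT family FROM person").fetchall()

# ORDER BY is the only part of the query that fixes the order of the result.
ordered = conn.execute("SELECT family FROM person ORDER BY family").fetchall()
```

Both queries return the same set of rows; only the second makes a promise about their sequence.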
John Wright
From: Remi Rampin [mailto:notifications@github.com]
Sent: Friday, October 26, 2018 1:55 PM
To: swcarpentry/sql-novice-survey <sql-novice-survey@noreply.github.com>
Cc: John Wright <John.Wright@jax.org>; Mention <mention@noreply.github.com>
Subject: Re: [swcarpentry/sql-novice-survey] Promoted SQL's set-based advantages (#268)
@remram44 requested changes on this pull request.
How about something like
Very good feedback. This pull request comes out of my experience lecturing at U. of Florida years ago and teaching interns and employees who hadn’t taken database classes. I saw two common anti-patterns: writing inline queries to grab a single row in a join, which essentially creates a fetch loop inside the DBMS and destroys performance, and fetching individual rows one at a time in a parameterized loop.
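A rough sketch of the second anti-pattern next to its set-based equivalent, using Python's built-in `sqlite3` module. The `person` and `survey` tables here are simplified, hypothetical stand-ins and not the lesson's real schema.

```python
import sqlite3

# Hypothetical, simplified tables; the data is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id TEXT PRIMARY KEY, family TEXT);
    CREATE TABLE survey (person TEXT, quant TEXT, reading REAL);
    INSERT INTO person VALUES ('dyer', 'Dyer'), ('lake', 'Lake');
    INSERT INTO survey VALUES
        ('dyer', 'rad', 9.82), ('lake', 'sal', 0.05), ('lake', 'rad', 1.46);
""")

# Anti-pattern: one parameterized lookup per outer row (a fetch loop).
looped = []
outer_rows = conn.execute("SELECT person, quant, reading FROM survey").fetchall()
for person_id, quant, reading in outer_rows:
    family = conn.execute(
        "SELECT family FROM person WHERE id = ?", (person_id,)
    ).fetchone()[0]
    looped.append((family, quant, reading))

# Set-based: describe both sets and how they relate; the engine plans the join.
joined = conn.execute("""
    SELECT person.family, survey.quant, survey.reading
    FROM survey JOIN person ON survey.person = person.id
""").fetchall()
```

Both approaches produce the same rows here, but the loop issues one extra query per survey row, while the join is a single statement the database can optimize as a whole.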
If we instead think about how to specify each set and how to join those sets, we make the most of what differentiates SQL, even compared with iterators and lambda/anonymous functions. This also helps people learn to avoid, or to expect, multiplicity when more than one row matches on a join. That’s where I saw people fall back on forced iteration, poor use of cursors, and extremely messy inline queries.
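The multiplicity point can be sketched the same way. This again assumes hypothetical `person` and `survey` tables with made-up rows, run through Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id TEXT PRIMARY KEY, family TEXT);
    CREATE TABLE survey (person TEXT, quant TEXT, reading REAL);
    INSERT INTO person VALUES ('dyer', 'Dyer'), ('lake', 'Lake');
    INSERT INTO survey VALUES
        ('dyer', 'rad', 9.82), ('lake', 'sal', 0.05), ('lake', 'rad', 1.46);
""")

# A person row appears once per matching survey row, so 'Lake' matches twice.
rows_per_person = dict(conn.execute("""
    SELECT person.family, COUNT(*)
    FROM person JOIN survey ON survey.person = person.id
    GROUP BY person.family
"""))

# If one row per person is wanted, aggregate the matches instead of
# iterating over them with a cursor or a nested inline query.
max_reading = dict(conn.execute("""
    SELECT person.family, MAX(survey.reading)
    FROM person JOIN survey ON survey.person = person.id
    GROUP BY person.family
"""))
```

Expecting the join to multiply rows, and then deciding how to collapse them with an aggregate, keeps the whole operation in one declarative statement.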
Maybe some compare/contrast examples would be more time-effective and practical than trying to acknowledge sets and relational algebra principles?
util.py: make functions return NotImplemented
Thank you for your contribution. This lesson has migrated to use The Carpentries Workbench, but unfortunately, due to various factors, the Maintainers of this lesson were unable to address this pull request before the transition. Because of this, we had to close your pull request.

Please note that this does not mean that your contribution was not valued. There are many reasons why a pull request is not merged. It's important to remember that the Maintainers are first and foremost people: people who maintain this lesson on a voluntary basis. Sometimes pull requests become stale because other responsibilities take precedence. Thank you for taking the time to open the pull request in the first place. If you wish to contribute again, you will need to delete and re-fork your repository.

How to contribute
If you wish to contribute, you will need to use the following steps to delete,

Questions
If you have any questions or would like assistance, please contact @core-team-curriculum (email: curriculum@carpentries.org) or you can respond to this message.