Promoted SQL's set-based advantages #268
SQL's set-based syntax can feel unfamiliar to those who have only been exposed to iterative programming languages, particularly those used to reading and writing delimited text files. SQL offers more concise, self-describing ways to model, relate, and aggregate data.
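As a rough illustration of the contrast described above, the sketch below totals values per person twice: once by looping over delimited text, and once with a single `GROUP BY` query. The data, the `visits` table, and the column names are hypothetical and are not taken from the lesson.

```python
import csv
import io
import sqlite3

# Hypothetical delimited data: one line per visit (person, samples collected).
RAW = """person,samples
dyer,12
dyer,5
pb,7
pb,3
lake,9
"""


def totals_iterative(text):
    """Iterative style: read rows one by one and accumulate totals by hand."""
    totals = {}
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["person"]] = totals.get(row["person"], 0) + int(row["samples"])
    return totals


def totals_sql(text):
    """Set-based style: load the rows, then describe the result in one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (person TEXT, samples INTEGER)")
    rows = [(r["person"], int(r["samples"])) for r in csv.DictReader(io.StringIO(text))]
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    # The query states what is wanted (a sum per person), not how to loop.
    result = dict(conn.execute("SELECT person, SUM(samples) FROM visits GROUP BY person"))
    conn.close()
    return result
```

Both functions return the same mapping; the SQL version leaves the grouping strategy to the database manager.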
You’re absolutely right, and I think the subtlety of a set being unordered is what conveys the idea. SQL illustrates set-based design because you provide no basis for the order in which rows are searched, filtered, and joined; you specify only the order of the end results, through the ORDER BY clause. (That holds for a pure, no-hints query, like the 99% of queries where you don’t tip off the execution plan engine. The other 1% is definitely not a fruitful topic for digression in Carpentries!)

https://www.itprotoday.com/analytics-reporting/programming-sql-set-based-way

The referenced article provides some great examples. Let me know if this makes more sense as a way to think about things in sets and not in iterative/programmatic loops.

Thanks!
John Wright
Manager, IT Clinical & Research Architecture
The Jackson Laboratory

From: Remi Rampin
Sent: Friday, October 26, 2018 1:55 PM
To: swcarpentry/sql-novice-survey
Cc: John Wright
Subject: Re: [swcarpentry/sql-novice-survey] Promoted SQL's set-based advantages (#268)

@remram44 requested changes on this pull request.

Hi @johnthomaswright, thanks for the contribution! I'm with you, I think it makes sense to make it clear what SQL is about. However I am not sure about "set-based language".
In _episodes/01-select.md:

@@ -12,6 +12,7 @@ keypoints:
 - "A relational database stores information in tables, each of which has a fixed set of columns and a variable number of records."
 - "A database manager is a program that manipulates information stored in a database."
 - "We write queries in a specialized language called SQL to extract information from databases."
+- "SQL is a set-based language, where we specify what data and format to return or save; but not how to save it or how to retrieve it (like you would specify in iterative loops in other languages.)

Maybe "query-based" language instead? "set" usually refers to unordered collections.
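To make the point about ordering in this exchange concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `person` table and its rows are hypothetical. The first query gives the database no ordering instruction, so its rows should be treated as an unordered set; only the second query, with `ORDER BY`, has a defined order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("pb", "Pabodie"), ("dyer", "Dyer"), ("lake", "Lake")],
)

# No ORDER BY: SQL makes no promise about row order. Whatever order comes
# back is an implementation detail, so compare such results as sets.
unordered = conn.execute("SELECT family FROM person").fetchall()

# ORDER BY is the one place where the order of the end result is specified.
ordered = conn.execute("SELECT family FROM person ORDER BY family").fetchall()
```

Comparing `set(unordered)` with `set(ordered)` shows that both queries return the same rows; only the second one guarantees how they are arranged.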
Very good feedback. The reason for this pull request has to do with my lecturing experience at U. of Florida years ago, and with teaching interns and employees who hadn’t taken database classes. Two common (anti-)patterns I saw were writing inline queries to grab a single row in joins, which essentially created a fetch loop in the DBMS and destroyed performance, and grabbing individual rows one at a time in a parameterized loop. If we instead think about how to specify each set and how to join those sets, we’re maximizing the utility and differentiation of SQL, even when compared to iterators and lambda/anonymous functions. This also helps people appreciate how to avoid, or expect, multiplicity when more than one row matches on a join. That’s where I saw procedural fallbacks to forced iteration, poor use of cursors, and extremely messy inline queries. Maybe some compare/contrast examples would be more time-effective and practical than trying to acknowledge sets and relational algebra principles?
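One possible compare/contrast example along these lines is sketched below. It assumes two hypothetical tables, `person` and `survey`, and contrasts the row-at-a-time parameterized loop with a single join. It also shows multiplicity: a person with two matching survey rows appears twice in the join, and a person with none does not appear at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id TEXT PRIMARY KEY, family TEXT);
CREATE TABLE survey (person TEXT, quant TEXT, reading REAL);
INSERT INTO person VALUES ('dyer', 'Dyer'), ('pb', 'Pabodie'), ('lake', 'Lake');
INSERT INTO survey VALUES ('dyer', 'rad', 9.82), ('dyer', 'sal', 0.13), ('pb', 'rad', 7.8);
""")


def readings_by_loop(conn):
    """Anti-pattern: one parameterized query per person, issued in a loop."""
    out = []
    people = conn.execute("SELECT id, family FROM person ORDER BY id").fetchall()
    for pid, family in people:
        matches = conn.execute(
            "SELECT quant, reading FROM survey WHERE person = ? ORDER BY quant",
            (pid,),
        )
        for quant, reading in matches:
            out.append((family, quant, reading))
    return out


def readings_by_join(conn):
    """Set-based: describe both sets and how they relate, in one query."""
    return conn.execute("""
        SELECT p.family, s.quant, s.reading
        FROM person AS p
        JOIN survey AS s ON s.person = p.id
        ORDER BY p.id, s.quant
    """).fetchall()
```

The two functions return the same rows, but the loop issues one query per person while the join issues one query in total. In the result, Dyer appears twice because two survey rows match, and Lake is absent because the inner join finds no match.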