How to time database creation vs text file creation with Python? #229

Open
gcapes opened this Issue Apr 13, 2018 · 6 comments

5 participants
@gcapes
Contributor

gcapes commented Apr 13, 2018

In episode 10, exercise Filling a Table vs. Printing Values, the challenge asks:

Write a Python program that creates a new database in a file called original.db containing a single table called Pressure, with a single field called reading, and inserts 100,000 random numbers between 10.0 and 25.0. How long does it take this program to run? How long does it take to run a program that simply writes those random numbers to a file?

No instructions are given on how to time the execution of each program, and the exercise does not say which one takes longer or why this is of interest.

@gcapes

Contributor

gcapes commented Apr 13, 2018

Not sure if this wants linking to #178?

@gcapes

Contributor

gcapes commented Apr 13, 2018

The same problem exists for the following exercise: Filtering in SQL vs. Filtering in Python

@remram44

Collaborator

remram44 commented Apr 16, 2018

This is not really a Python course, so I'm not sure if explaining how to use time or timeit is in scope?

@remram44 remram44 added the discussion label Apr 16, 2018

@SamHames

Collaborator

SamHames commented Apr 17, 2018

I agree with @remram44 that it does seem beyond the scope of the SQL lessons to look at Python timing commands.

I don't really know what the insert question is intending to demonstrate -- one thing it could point towards is the use of transactions for atomicity and speed (and the transaction management the Python adapter exposes), but that also seems beyond the scope of this course.
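To make the transaction point concrete, here is a minimal sketch (not taken from the lesson) that inserts the same values twice: once committing after every row, and once with a single commit at the end. The file names, the helper `insert_rows`, and the reduced row count of 500 are assumptions chosen so it finishes quickly; actual timings depend heavily on the disk.

```python
import os
import random
import sqlite3
import tempfile
import time


def insert_rows(path, values, commit_every_row):
    """Insert values into a fresh Pressure table; return (seconds, row count)."""
    connection = sqlite3.connect(path)
    connection.execute("CREATE TABLE Pressure (reading REAL);")
    start = time.perf_counter()
    for value in values:
        connection.execute("INSERT INTO Pressure (reading) VALUES (?);", (value,))
        if commit_every_row:
            connection.commit()  # ends the transaction after each row
    connection.commit()  # ends the single large transaction, if one is open
    elapsed = time.perf_counter() - start
    count = connection.execute("SELECT COUNT(*) FROM Pressure;").fetchone()[0]
    connection.close()
    return elapsed, count


# Hypothetical, deliberately small row count so the per-row case stays quick.
values = [random.uniform(10.0, 25.0) for _ in range(500)]

with tempfile.TemporaryDirectory() as workdir:
    per_row_seconds, per_row_count = insert_rows(
        os.path.join(workdir, "per_row.db"), values, commit_every_row=True
    )
    batch_seconds, batch_count = insert_rows(
        os.path.join(workdir, "batch.db"), values, commit_every_row=False
    )

print(f"commit per row: {per_row_seconds:.4f} s for {per_row_count} rows")
print(f"single commit:  {batch_seconds:.4f} s for {batch_count} rows")
```

Both runs store the same rows; only the number of commits differs, so any gap in the printed times comes from transaction handling rather than from the data.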

For the filtering case I think the intent of the timing is to demonstrate that it's generally faster and simpler to read exactly what you need from the database, rather than loading everything into memory and operating on it there.
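A rough sketch of that comparison follows. The exercise text is not quoted in this thread, so the in-memory Pressure table, the cutoff of 20.0, and the variable names are all assumptions for illustration. It times a `WHERE` clause against loading every row and filtering in a list comprehension.

```python
import random
import sqlite3
import time

# Assumed setup: an in-memory table shaped like the one in the previous exercise.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Pressure (reading REAL);")
connection.executemany(
    "INSERT INTO Pressure (reading) VALUES (?);",
    ((random.uniform(10.0, 25.0),) for _ in range(100_000)),
)
connection.commit()

threshold = 20.0  # hypothetical cutoff, not from the lesson

# Filter in SQL: only the matching rows are handed to Python.
start = time.perf_counter()
cursor = connection.execute(
    "SELECT reading FROM Pressure WHERE reading > ?;", (threshold,)
)
sql_filtered = [row[0] for row in cursor]
sql_seconds = time.perf_counter() - start

# Filter in Python: fetch every row, then discard the ones not wanted.
start = time.perf_counter()
all_rows = connection.execute("SELECT reading FROM Pressure;").fetchall()
python_filtered = [row[0] for row in all_rows if row[0] > threshold]
python_seconds = time.perf_counter() - start

connection.close()

print(f"filter in SQL:    {sql_seconds:.4f} s, {len(sql_filtered)} rows")
print(f"filter in Python: {python_seconds:.4f} s, {len(python_filtered)} rows")
```

With a single small column the gap may be modest; it should matter more when rows are wide or the database is on disk, since the Python version has to pull everything into memory first.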

@gcapes

Contributor

gcapes commented Apr 17, 2018

Agreed that the point isn't to teach Python (so there's no need to explain Python timing commands in the question), but if there is a point to be made about the speed of operations in SQL vs Python, then surely the solution should contain some commands to print out the execution time for each?
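For reference, a solution along those lines might look something like the sketch below. This is not the lesson's official solution: the helper names (`fill_database`, `write_text_file`, `timed`), the text file name, and the use of a temporary directory are assumptions, and `time.perf_counter` is just one of several ways to measure elapsed time.

```python
import os
import random
import sqlite3
import tempfile
import time


def fill_database(path, values):
    """Create a Pressure table with a single 'reading' field and insert the values."""
    connection = sqlite3.connect(path)
    connection.execute("CREATE TABLE Pressure (reading REAL);")
    connection.executemany(
        "INSERT INTO Pressure (reading) VALUES (?);",
        ((value,) for value in values),
    )
    connection.commit()
    connection.close()


def write_text_file(path, values):
    """Write one value per line to a plain text file."""
    with open(path, "w") as handle:
        for value in values:
            handle.write(f"{value}\n")


def timed(func, *args):
    """Return the wall-clock seconds taken by func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


values = [random.uniform(10.0, 25.0) for _ in range(100_000)]

# A temporary directory avoids "table already exists" errors on reruns.
with tempfile.TemporaryDirectory() as workdir:
    db_path = os.path.join(workdir, "original.db")
    txt_path = os.path.join(workdir, "original.txt")  # assumed file name
    db_seconds = timed(fill_database, db_path, values)
    txt_seconds = timed(write_text_file, txt_path, values)

    # Confirm that all rows were stored before the files are cleaned up.
    check = sqlite3.connect(db_path)
    row_count = check.execute("SELECT COUNT(*) FROM Pressure;").fetchone()[0]
    check.close()

print(f"database:  {db_seconds:.4f} s for {row_count} rows")
print(f"text file: {txt_seconds:.4f} s for {len(values)} lines")
```

Generating the random numbers once, outside the timed sections, keeps the comparison about the writing step only.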

@spriggsy83

spriggsy83 commented May 18, 2018

As students are encouraged to use a Unix shell for completing this set of lessons, this question could simply encourage them to use the shell's built-in 'time' command? E.g.:
time python myRandomTableMaker.py
time python myRandomTextMaker.py

If the point of the exercise is to experiment and think about which was faster, I would also encourage them to repeat it for different-sized tables, i.e. what's the time difference if inserting/writing just 100 rows, or 1,000 rows, or 500,000 rows, or 1 million rows, etc.
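If the repetition is done inside Python rather than from the shell, it could be sketched roughly as below. The function `time_both`, the file names, and the particular sizes are assumptions; the list is kept short so it runs quickly, and larger sizes such as 500,000 or 1 million can be appended.

```python
import os
import random
import sqlite3
import tempfile
import time


def time_both(n, workdir):
    """Return (database seconds, text file seconds) for writing n random values."""
    values = [random.uniform(10.0, 25.0) for _ in range(n)]

    start = time.perf_counter()
    connection = sqlite3.connect(os.path.join(workdir, f"pressure_{n}.db"))
    connection.execute("CREATE TABLE Pressure (reading REAL);")
    connection.executemany(
        "INSERT INTO Pressure (reading) VALUES (?);",
        ((value,) for value in values),
    )
    connection.commit()
    connection.close()
    db_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(os.path.join(workdir, f"pressure_{n}.txt"), "w") as handle:
        handle.writelines(f"{value}\n" for value in values)
    txt_seconds = time.perf_counter() - start

    return db_seconds, txt_seconds


sizes = [100, 1_000, 100_000]  # assumed sizes; extend as needed
results = {}
with tempfile.TemporaryDirectory() as workdir:
    for n in sizes:
        results[n] = time_both(n, workdir)

for n, (db_seconds, txt_seconds) in results.items():
    print(f"{n:>9} rows: database {db_seconds:.4f} s, text file {txt_seconds:.4f} s")
```

Unlike the shell's 'time', this excludes interpreter start-up, which can dominate the measurement for the smallest sizes.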
