forked from swaroopch/byte-of-python
/
12-problem-solving.pd
304 lines (207 loc) · 16.9 KB
/
12-problem-solving.pd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
# Problem Solving
We have explored various parts of the Python language and now we will take a look at how all these parts fit together, by designing and writing a program which *does* something useful. The idea is to learn how to write a Python script on your own.
## The Problem
The problem we want to solve is *"I want a program which creates a backup of all my important files"*.
Although, this is a simple problem, there is not enough information for us to get started with the solution. A little more **analysis** is required. For example, how do we specify *which* files are to be backed up? *How* are they stored? *Where* are they stored?
After analyzing the problem properly, we **design** our program. We make a list of things about how our program should work. In this case, I have created the following list on how *I* want it to work. If you do the design, you may not come up with the same kind of analysis since every person has their own way of doing things, so that is perfectly okay.
- The files and directories to be backed up are specified in a list.
- The backup must be stored in a main backup directory.
- The files are backed up into a zip file.
- The name of the zip archive is the current date and time.
- We use the standard `zip` command available by default in any standard Linux/Unix distribution. Windows users can [install](http://gnuwin32.sourceforge.net/downlinks/zip.php) from the [GnuWin32 project page](http://gnuwin32.sourceforge.net/packages/zip.htm) and add `C:\Program Files\GnuWin32\bin` to your system PATH environment variable, similar to [what we did for recognizing the python command itself](#dos-prompt). Note that you can use any archiving command you want as long as it has a command line
## The Solution
As the design of our program is now reasonably stable, we can write the code which is an **implementation** of our solution.
Save as `backup_ver1.py`:
~~~python
import os
import time
# 1. The files and directories to be backed up are specified in a list.
source = ['"C:\\My Documents"', 'C:\\Code']
# Notice we had to use double quotes inside the string for names with spaces in it.
# 2. The backup must be stored in a main backup directory
target_dir = 'E:\\Backup' # Remember to change this to what you will be using
# 3. The files are backed up into a zip file.
# 4. The name of the zip archive is the current date and time
target = target_dir + os.sep + time.strftime('%Y%m%d%H%M%S') + '.zip'
# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -qr {0} {1}".format(target, ' '.join(source))
# Run the backup
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')
~~~
Output:
~~~
$ python backup_ver1.py
Successful backup to E:\Backup\20080702185040.zip
~~~
Now, we are in the **testing** phase where we test that our program works properly. If it doesn't behave as expected, then we have to **debug** our program i.e. remove the *bugs* (errors) from the program.
If the above program does not work for you, put a `print(zip_command)` just before the `os.system` call and run the program. Now copy/paste the printed zip_command to the shell prompt and see if it runs properly on its own. If this command fails, check the zip command manual on what could be wrong. If this command succeeds, then check the Python program if it exactly matches the program written above.
How It Works:
You will notice how we have converted our *design* into *code* in a step-by-step manner.
We make use of the `os` and `time` modules by first importing them. Then, we specify the files and directories to be backed up in the `source` list. The target directory is where we store all the backup files and this is specified in the `target_dir` variable. The name of the zip archive that we are going to create is the current date and time which we generate using the `time.strftime()` function. It will also have the `.zip` extension and will be stored in the `target_dir` directory.
Notice the use of the `os.sep` variable - this gives the directory separator according to your operating system i.e. it will be `'/'` in Linux and Unix, it will be `'\\'` in Windows and `':'` in Mac OS. Using `os.sep` instead of these characters directly will make our program portable and work across all of these systems.
The `time.strftime()` function takes a specification such as the one we have used in the above program. The `%Y` specification will be replaced by the year with the century. The `%m` specification will be replaced by the month as a decimal number between `01` and `12` and so on. The complete list of such specifications can be found in the [Python Reference Manual](http://docs.python.org/py3k/library/time.html#time.strftime).
We create the name of the target zip file using the addition operator which *concatenates* the strings i.e. it joins the two strings together and returns a new one. Then, we create a string `zip_command` which contains the command that we are going to execute. You can check if this command works by running it in the shell (Linux terminal or DOS prompt).
The `zip` command that we are using has some options and parameters passed. The `-q` option is used to indicate that the zip command should work **q**uietly. The `-r` option specifies that the zip command should work **r**ecursively for directories i.e. it should include all the subdirectories and files. The two options are combined and specified in a shortcut as `-qr`. The options are followed by the name of the zip archive to create followed by the list of files and directories to backup. We convert the `source` list into a string using the `join` method of strings which we have already seen how to use.
Then, we finally *run* the command using the `os.system` function which runs the command as if it was run from the *system* i.e. in the shell - it returns `0` if the command was successfully, else it returns an error number.
Depending on the outcome of the command, we print the appropriate message that the backup has failed or succeeded.
That's it, we have created a script to take a backup of our important files!
Note to Windows Users
: Instead of double backslash escape sequences, you can also use raw strings. For example, use `'C:\\Documents'` or `r'C:\Documents'`. However, do **not** use `'C:\Documents'` since you end up using an unknown escape sequence `\D`.
Now that we have a working backup script, we can use it whenever we want to take a backup of the files. Linux/Unix users are advised to use the [executable method](#executable-python-programs) as discussed earlier so that they can run the backup script anytime anywhere. This is called the **operation** phase or the **deployment** phase of the software.
The above program works properly, but (usually) first programs do not work exactly as you expect. For example, there might be problems if you have not designed the program properly or if you have made a mistake when typing the code, etc. Appropriately, you will have to go back to the design phase or you will have to debug your program.
## Second Version
The first version of our script works. However, we can make some refinements to it so that it can work better on a daily basis. This is called the **maintenance** phase of the software.
One of the refinements I felt was useful is a better file-naming mechanism - using the *time* as the name of the file within a directory with the current *date* as a directory within the main backup directory. The first advantage is that your backups are stored in a hierarchical manner and therefore it is much easier to manage. The second advantage is that the filenames are much shorter. The third advantage is that separate directories will help you check if you have made a backup for each day since the directory would be created only if you have made a backup for that day.
Save as `backup_ver2.py`:
~~~python
import os
import time
# 1. The files and directories to be backed up are specified in a list.
source = ['"C:\\My Documents"', 'C:\\Code']
# Notice we had to use double quotes inside the string for names with spaces in it.
# 2. The backup must be stored in a main backup directory
target_dir = 'E:\\Backup' # Remember to change this to what you will be using
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory in the main directory
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive
now = time.strftime('%H%M%S')
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today) # make directory
print('Successfully created directory', today)
# The name of the zip file
target = today + os.sep + now + '.zip'
# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -qr {0} {1}".format(target, ' '.join(source))
# Run the backup
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')
~~~
Output:
~~~
$ python3 backup_ver2.py
Successfully created directory E:\Backup\20080702
Successful backup to E:\Backup\20080702\202311.zip
$ python3 backup_ver2.py
Successful backup to E:\Backup\20080702\202325.zip
~~~
How It Works:
Most of the program remains the same. The changes are that we check if there is a directory with the current day as its name inside the main backup directory using the `os.path.exists` function. If it doesn't exist, we create it using the `os.mkdir` function.
## Third Version
The second version works fine when I do many backups, but when there are lots of backups, I am finding it hard to differentiate what the backups were for! For example, I might have made some major changes to a program or presentation, then I want to associate what those changes are with the name of the zip archive. This can be easily achieved by attaching a user-supplied comment to the name of the zip archive.
Note
: The following program does not work, so do not be alarmed, please follow along because there's a lesson in here.
Save as `backup_ver3.py`:
~~~python
import os
import time
# 1. The files and directories to be backed up are specified in a list.
source = ['"C:\\My Documents"', 'C:\\Code']
# Notice we had to use double quotes inside the string for names with spaces in it.
# 2. The backup must be stored in a main backup directory
target_dir = 'E:\\Backup' # Remember to change this to what you will be using
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory in the main directory
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive
now = time.strftime('%H%M%S')
# Take a comment from the user to create the name of the zip file
comment = input('Enter a comment --> ')
if len(comment) == 0: # check if a comment was entered
target = today + os.sep + now + '.zip'
else:
target = today + os.sep + now + '_' +
comment.replace(' ', '_') + '.zip'
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today) # make directory
print('Successfully created directory', today)
# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -qr {0} {1}".format(target, ' '.join(source))
# Run the backup
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')
~~~
Output:
~~~
$ python3 backup_ver3.py
File "backup_ver3.py", line 25
target = today + os.sep + now + '_' +
^
SyntaxError: invalid syntax
~~~
How This (does not) Work:
**This program does not work!** Python says there is a syntax error which means that the script does not satisfy the structure that Python expects to see. When we observe the error given by Python, it also tells us the place where it detected the error as well. So we start *debugging* our program from that line.
On careful observation, we see that the single logical line has been split into two physical lines but we have not specified that these two physical lines belong together. Basically, Python has found the addition operator (`+`) without any operand in that logical line and hence it doesn't know how to continue. Remember that we can specify that the logical line continues in the next physical line by the use of a backslash at the end of the physical line. So, we make this correction to our program. This correction of the program when we find errors is called **bug fixing**.
## Fourth Version
Save as `backup_ver4.py`:
~~~python
import os
import time
# 1. The files and directories to be backed up are specified in a list.
source = ['"C:\\My Documents"', 'C:\\Code']
# Notice we had to use double quotes inside the string for names with spaces in it.
# 2. The backup must be stored in a main backup directory
target_dir = 'E:\\Backup' # Remember to change this to what you will be using
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory in the main directory
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive
now = time.strftime('%H%M%S')
# Take a comment from the user to create the name of the zip file
comment = input('Enter a comment --> ')
if len(comment) == 0: # check if a comment was entered
target = today + os.sep + now + '.zip'
else:
target = today + os.sep + now + '_' + \
comment.replace(' ', '_') + '.zip'
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today) # make directory
print('Successfully created directory', today)
# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -qr {0} {1}".format(target, ' '.join(source))
# Run the backup
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')
~~~
Output:
~~~
$ python3 backup_ver4.py
Enter a comment --> added new examples
Successful backup to E:\Backup\20080702\202836_added_new_examples.zip
$ python3 backup_ver4.py
Enter a comment -->
Successful backup to E:\Backup\20080702\202839.zip
~~~
How It Works:
This program now works! Let us go through the actual enhancements that we had made in version 3. We take in the user's comments using the `input` function and then check if the user actually entered something by finding out the length of the input using the `len` function. If the user has just pressed `enter` without entering anything (maybe it was just a routine backup or no special changes were made), then we proceed as we have done before.
However, if a comment was supplied, then this is attached to the name of the zip archive just before the `.zip` extension. Notice that we are replacing spaces in the comment with underscores - this is because managing filenames without spaces is much easier.
## More Refinements
The fourth version is a satisfactorily working script for most users, but there is always room for improvement. For example, you can include a *verbosity* level for the program where you can specify a `-v` option to make your program become more talkative.
Another possible enhancement would be to allow extra files and directories to be passed to the script at the command line. We can get these names from the `sys.argv` list and we can add them to our `source` list using the `extend`method provided by the `list` class.
The most important refinement would be to not use the `os.system` way of creating archives and instead using the `zipfile` or `tarfile` built-in module to create these archives. They are part of the standard library and available already for you to use without external dependencies on the zip program to be available on your computer.
However, I have been using the `os.system` way of creating a backup in the above examples purely for pedagogical purposes, so that the example is simple enough to be understood by everybody but real enough to be useful.
Can you try writing the fifth version that uses the [zipfile](http://docs.python.org/py3k/library/zipfile.html) module instead of the `os.system` call?
## The Software Development Process
We have now gone through the various **phases** in the process of writing a software. These phases can be summarised as follows:
#. What (Analysis)
#. How (Design)
#. Do It (Implementation)
#. Test (Testing and Debugging)
#. Use (Operation or Deployment)
#. Maintain (Refinement)
A recommended way of writing programs is the procedure we have followed in creating the backup script: Do the analysis and design. Start implementing with a simple version. Test and debug it. Use it to ensure that it works as expected. Now, add any features that you want and continue to repeat the Do It-Test-Use cycle as many times as required. Remember, **Software is grown, not built**.
## Summary
We have seen how to create our own Python programs/scripts and the various stages involved in writing such programs. You may find it useful to create your own program just like we did in this chapter so that you become comfortable with Python as well as problem-solving.
Next, we will discuss object-oriented programming.