# Problem Solving by Programming
*aka Performing Tasks with Computers*

One of the great satisfactions of computer programming is the ability to solve real-world tasks - especially when others can use our creations.

One approach to creating programs is to apply procedural programming.  In this paradigm, programmers create a list of instructions that computers must perform in order to solve a particular problem or perform a specific task.  For right now, we just consider small programs - as the course progresses, our programs will become larger and we will adopt more techniques to approach these problems.

In the [Introductory Notebook](01-Introduction.ipynb), the first task was to open a web page from the Wayback Machine.  Suppose instead that you were asked to find the homepage of [The News & Observer](https://www.newsobserver.com/) from November 11th, 2018.  One way to begin to create an algorithm for this process is to use "The Seven Steps" [1]. (The following description is inspired by "All of Programming" by Drew Hilton and Anne Bracy.)

![](images/sevensteps.png)



Step One is to manually work through one instance of the problem. Think through the various actions that need to occur and the order of those actions.

One way we could work the instance ourselves is to visit https://archive.org/.  Once there, we could enter https://www.newsobserver.com/ as the URL to search in the Wayback Machine. This produces a timeline and calendar from which we can select a particular page.  So we click on 2018.  Then we click on November 11th in the calendar and select one of the snapshot times listed for that date.  Success!  The page refreshes with this URL: https://web.archive.org/web/20181111065826/https://www.newsobserver.com/

Step Two is to write these steps down.  For this step and Step Three, write the steps in "pseudocode".  This is an informal, natural language (e.g., English or Chinese) description of how the program will work.  Don't worry if you're not sure how this really works yet; the course has plenty of examples.

<pre>
1. Visit https://archive.org/
2. Type https://www.newsobserver.com into the Wayback Machine and press "Enter" 
3. Select 2018 from the timeline
4. Select November 11th from the calendar and then one of the snapshot times
5. Wait for the browser to respond with the archived page
</pre>

As you write these steps down, make sure you explicitly state everything.  Are there implicit steps or instructions missing? A computer cannot infer the steps that our minds fill in implicitly.  For example, in the steps above, how do I visit that particular website? (Open a browser window and enter the URL into the address/search bar.) Many computer programming books use recipes or knitting patterns as examples of sequences of steps to be followed.

Step Three involves generalizing our steps.  Our initial solution works for a specific site and date, but also requires manual intervention.  We need to research ways to access the Wayback Machine programmatically. Often, sites provide an [application programming interface (API)](https://en.wikipedia.org/wiki/API#Web_APIs) as a way to access data. Performing a web search on "wayback machine api" leads to this page: https://archive.org/help/wayback_api.php. In reading this page, we see that the Wayback Machine has an API endpoint with URLs that follow this example: https://archive.org/wayback/available?url=example.com&timestamp=20060101
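As a minimal sketch of how such a URL could be assembled in Python, the example below uses the standard library's `urlencode`. The function name `build_availability_url` and the constant `API_ENDPOINT` are hypothetical names chosen for illustration; only the endpoint and the `url` and `timestamp` parameters come from the example URL above.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL on the API help page.
API_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(website_url, timestamp):
    """Return the availability URL for a site and a YYYYMMDD timestamp."""
    # urlencode percent-encodes characters such as ":" and "/" in the values,
    # so a full address like https://www.newsobserver.com/ is passed safely.
    return API_ENDPOINT + "?" + urlencode({"url": website_url, "timestamp": timestamp})


print(build_availability_url("example.com", "20060101"))
# https://archive.org/wayback/available?url=example.com&timestamp=20060101
```

Building the query string with `urlencode` rather than by hand avoids mistakes when the site address itself contains characters that have a special meaning inside a URL.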

In other situations, you will need to research the problem domain to gain a deeper understanding of what's occurring and how you can automate it.

Cool. Now we know how to programmatically access the Wayback Machine, but we still don't have a general algorithm.  We have something that works for a specific problem, but we want something that works for all instances of this problem - which means any site and any date.  The purpose of this step is to examine why you performed your steps, what patterns may exist in the possible solution, and how various inputs can be handled.

To generalize this solution, we need to pass the URL and date as inputs into the process.  For right now, we can just have the user enter these two values.

So let's look at the generalized set of steps:
<pre>
1. Have the user enter a website URL (website_url)
2. Have the user enter a date in the format of YYYYMMDD (search_date)
3. Create a URL of https://archive.org/wayback/available?url=website_url&amp;timestamp=search_date
4. Open a connection to that URL
5. Read the results of the URL
6. From the results, get the URL of the closest page.
7. Open a browser with the closest URL.
</pre>
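Steps 5 and 6 are easier to reason about with a concrete response in hand. The sketch below parses a hand-written sample rather than a live response. Its shape (an `archived_snapshots` object containing a `closest` entry) is an assumption based on the examples in the API help page, and the values simply reuse the archived URL found during the manual walk-through.

```python
import json

# Hand-written sample for illustration; a live request may return other values.
sample_response = """
{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "url": "https://web.archive.org/web/20181111065826/https://www.newsobserver.com/",
      "timestamp": "20181111065826",
      "status": "200"
    }
  }
}
"""

results = json.loads(sample_response)               # step 5: read the results
closest = results["archived_snapshots"]["closest"]  # step 6: find the closest page
print(closest["url"])
# https://web.archive.org/web/20181111065826/https://www.newsobserver.com/
```

Tracing a sample like this by hand also raises a question the pseudocode does not yet answer: what should happen when the results contain no closest page at all?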

In Step Four, we then manually perform the steps to ensure we have a correct, working algorithm.  This example is relatively straightforward, but in more complex situations we'll want to have at least some assurance here that our approach is sound. For all but the simplest situations, it is impossible to determine how many test cases are required: tests can demonstrate that defects exist, but they cannot prove that no defects exist.  Such proof requires formal verification using models and logic that can be reasoned upon.  Hardware manufacturers are probably the leading users of formal verification in industry, but widespread use among software developers is practically nonexistent. As we discuss testing in later notebooks, we will look at different ways to quantify the appropriate amount of testing required.

Side Note: One of the fundamental questions within Computer Science has been whether it is possible to determine, for an arbitrary program and a given input, whether that program will halt.  Known as the [Halting Problem](https://en.wikipedia.org/wiki/Halting_problem), it has been shown to be unsolvable in general, and it has been used as the basis for showing that other problems are equivalent to it, and hence also unsolvable.  Proving the absence of all defects is one such problem.

The goal of these first four steps is to have performed sufficient planning to minimize, and ideally eliminate, problems as we then convert our steps into computer code and test that code.  Often, you will hear this referred to as "design".  Our "requirement" was to develop a routine to search the Wayback Machine for a particular URL and date, and then to open a web browser with the closest URL available in the archive.

From a documentation perspective, the expected outcome should contain:
* Inputs to the routine
* Outputs from the routine 
* Pseudocode (steps in the routine)

This "outcome" is the basis of an *algorithm*, which is a well-defined procedure (series of steps) that takes some input and produces some value or set of values (output).[2]

Pseudocode Guidelines: [3]
* Use natural language statements that precisely describe specific operations
* Avoid syntactic elements from the target programming language
* Write pseudocode at the level of intent - document the approach to solve the problem
* Write pseudocode at a low enough level that generating code from it is straightforward

In Step Five, we convert the pseudocode into code in our target language.
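As a hedged sketch of what this conversion might look like for the generalized steps listed earlier, the Python below maps each pseudocode step to a line or a small function. All function names are hypothetical, and the response keys (`archived_snapshots`, `closest`, `url`) are an assumption based on the API help page. This is one possible translation, not a definitive implementation.

```python
import json
import urllib.parse
import urllib.request
import webbrowser

API_ENDPOINT = "https://archive.org/wayback/available"


def build_query_url(website_url, search_date):
    """Step 3: create the URL for the availability API."""
    query = urllib.parse.urlencode({"url": website_url, "timestamp": search_date})
    return f"{API_ENDPOINT}?{query}"


def fetch_results(query_url):
    """Steps 4 and 5: open a connection to the URL and read the results."""
    with urllib.request.urlopen(query_url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def closest_url(results):
    """Step 6: get the URL of the closest page, or None if there is none."""
    closest = results.get("archived_snapshots", {}).get("closest")
    if not closest:
        return None
    return closest.get("url")


def main():
    website_url = input("Website URL: ").strip()      # Step 1
    search_date = input("Date (YYYYMMDD): ").strip()  # Step 2
    results = fetch_results(build_query_url(website_url, search_date))
    page_url = closest_url(results)
    if page_url is None:
        print("No archived page was found.")
    else:
        webbrowser.open(page_url)                     # Step 7
```

Calling `main()` runs the whole routine. Keeping the network request in its own function means the URL building and the result handling can be checked without an internet connection.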

In Step Six, we then run and test the program to see if it works.  If all of our tests pass, we are complete.  If we find issues, then we need to debug our program (Step Seven) to find and correct those issues, which we'll fix by going back to Step Five.
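To make the idea of a test concrete, here is a small sketch built around one piece of the routine: checking that the date the user enters really is in the YYYYMMDD format. The helper `is_valid_search_date` is a hypothetical addition for illustration and is not part of the pseudocode above.

```python
from datetime import datetime


def is_valid_search_date(search_date):
    """Return True if search_date is a calendar date written as YYYYMMDD."""
    # strptime can be lenient about missing leading zeros, so check the
    # length and the characters before parsing.
    if len(search_date) != 8 or not search_date.isdigit():
        return False
    try:
        datetime.strptime(search_date, "%Y%m%d")
    except ValueError:
        return False
    return True


def test_is_valid_search_date():
    assert is_valid_search_date("20181111")        # the date from our example
    assert not is_valid_search_date("2018111")     # too short
    assert not is_valid_search_date("2018-11-11")  # wrong format
    assert not is_valid_search_date("20181301")    # there is no month 13
    assert not is_valid_search_date("20180230")    # February has no 30th day
```

A test runner such as pytest collects functions whose names start with `test_` automatically; calling `test_is_valid_search_date()` directly works as well. If any assertion fails, that is the signal to move on to debugging.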

As you increase your coding skills, much of this process will become second nature and you can minimize or combine steps. However, you should always spend an appropriate amount of time planning out your solution.  For easier problems, this can just be a few scribbles or notes.  For more complex problems and large systems, you will spend a substantial amount of time performing research to figure out the best approach and then creating documentation for that approach.

## References
[1] Andrew D. Hilton, Genevieve M. Lipp, and Susan H. Rodger. 2019. _Translation from Problem to Code in Seven Steps_. In Proceedings of the ACM Conference on Global Computing Education (CompEd '19). Association for Computing Machinery, New York, NY, USA, 78–84. https://doi.org/10.1145/3300115.3309508

[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. _Introduction to Algorithms_ (3rd ed.). The MIT Press.

[3] Steve McConnell. 2004. _Code Complete_ (2nd ed.). Microsoft Press, USA.