# DS102 Statistical Programming in R : Lesson One - Thinking Like a Programmer

## Thinking Like a Programmer

In [1]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Lesson One
VimeoVideo('234929904', width=720, height=480)

### Table of Contents <a class="anchor" id="DS102L1_toc"></a>

* [Table of Contents](#DS102L1_toc)
    * [Page 1 - Welcome](#DS102L1_page_1)
    * [Page 2 - Think like a Data Scientist](#DS102L1_page_2)
    * [Page 3 - Conversion](#DS102L1_page_3)
    * [Page 4 - Peanut Butter and Jelly Time](#DS102L1_page_4)
    * [Page 5 - PBJ Instructions Check](#DS102L1_page_5)
    * [Page 6 - Breaking Down Problems](#DS102L1_page_6)
    * [Page 7 - Breaking Down Problems Review](#DS102L1_page_7)
    * [Page 8 - Mistakes are INEVITABLE](#DS102L1_page_8)
    * [Page 9 - Imposter Syndrome](#DS102L1_page_9)
    * [Page 10 - Key Terms](#DS102L1_page_10)
    * [Page 11 - Lesson 1 Hands-On](#DS102L1_page_11)
    

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 1 - Welcome <a class="anchor" id="DS102L1_page_1"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

## Welcome

Congratulations! You are on the brink of learning your very first programming language! You will be learning a lot as you go through the lessons. It will be hard work, but know that you can do it! Follow along and when you get stuck, keep in mind that there are several great public resources available to you like **<a href="https://stackoverflow.com/" target="_blank">StackOverflow</a>**, **<a href="https://www.w3schools.com/" target="_blank">W3Schools</a>**, or a crowd favorite, **<a href="https://www.google.com" target="_blank">Google</a>**.  

> _"Your first projects aren't the greatest things in the world, and they may have no money value, they may go nowhere, but that is how you learn - you put so much effort into making something right if it is for yourself."_
>
> **- Steve Wozniak**

In this lesson, you will learn about: 

* Breaking problems down into simple steps
* Providing instructions in simple steps
* The value of mistakes
* Imposter syndrome

This lesson will culminate in a Hands-On that will require you to think like a programmer to solve a logic puzzle.

In [2]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Problem Solving and Critical Thinking
VimeoVideo('407763686', width=720, height=480)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 2 - Think like a Data Scientist<a class="anchor" id="DS102L1_page_2"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [3]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Thinking like a data scientist
VimeoVideo('325919813', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L01-pg2tutorial.zip)**.

The biggest issue new students run into is how to adjust their thinking to the style that data science requires. Even as computers become more advanced, they continue to be very good at doing exactly what you tell them to do &mdash; even if it makes no sense at all. This means that you must become very good at saying precisely what you want the computer to do. This seems easy enough at first &ndash; after all, you communicate with other people all the time &ndash; but it is notoriously difficult to get right early on. Other people are capable of understanding your intent, adding context, and asking for clarification when they do not understand. Computers do none of this. At best, computers give an error when they do not understand, and at worst, they will carry out an instruction that you did not intend. This leads to problems with your program, crashes, or even data loss. The process of learning to think like a programmer is part learning to be very specific and part learning how to deal with seemingly impossibly large problems. You'll explore both of these throughout this lesson.

---

## How To Eat An Elephant

There's an old joke, `"How do you eat an elephant?"`, which seems absurd at first. Eating an elephant is not something that any human is going to be able to do. The punchline, `one bite at a time`, while corny, does have an element of truth to it. Given enough time and small enough bites, you probably _could_ eat an elephant. So it is with large problems, both in programming and elsewhere. Writing a large report, cooking a three-course meal for 20 guests, and using machine learning are all impossible to do in one step. They must be broken down into smaller pieces until you are left with a pile of achievable tasks.

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 3 - Conversion<a class="anchor" id="DS102L1_page_3"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [4]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Conversion
VimeoVideo('325919753', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L01-pg3tutorial.zip)**.

# Conversion

One example of the kind of thinking that programming and data science require is unit conversion. For example, if you have 87 inches of ribbon, and you want to wrap it around a gift box that measures 4 feet around, would you have enough ribbon?

The solution requires a few steps:

1. Determine how many inches there are in a foot.
2. Multiply the box circumference (in feet) by the number of inches per foot to get the total number of inches of ribbon needed to wrap the box.
3. Determine whether the result of #2 is less than or equal to the number of inches of available ribbon.

Here's how you'd think this through:

```text
1. 12 inches per foot

2. 4 * 12 = 48 inches of ribbon needed

3. 48 <= 87
```

So yes, you would have plenty of ribbon. You could almost wrap around twice!
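The same three steps can also be written as a short program. The following is only a minimal sketch in Python (the language of this notebook's code cells); the names `INCHES_PER_FOOT`, `inches_needed`, and `has_enough_ribbon` are made up for this illustration and are not part of the course materials.

```python
# Step 1: the conversion factor.
INCHES_PER_FOOT = 12


def inches_needed(circumference_feet):
    """Step 2: convert the box circumference from feet to inches."""
    return circumference_feet * INCHES_PER_FOOT


def has_enough_ribbon(ribbon_inches, circumference_feet):
    """Step 3: check whether the ribbon needed fits within the ribbon available."""
    return inches_needed(circumference_feet) <= ribbon_inches


print(inches_needed(4))          # 48
print(has_enough_ribbon(87, 4))  # True
```

Notice that each function does exactly one of the steps listed above. Keeping the steps separate makes it easier to check each one on its own.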

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 4 - Peanut Butter and Jelly Time<a class="anchor" id="DS102L1_page_4"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


```c-lms
topic: PBJ Time
```

# Peanut Butter and Jelly Time

An important aspect of thinking like a programmer is to be precise when giving instructions. This is a tricky thing for most people since it is so unlike what you typically do in everyday life.

A classic exercise in learning this skill is to write out a list of step-by-step instructions for making a peanut butter and jelly sandwich.

Go ahead and do this now. When you're done, move on to the next section.

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 5 - PBJ Instructions Check<a class="anchor" id="DS102L1_page_5"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


```c-lms
topic: PBJ Instructions
```

# PBJ Instructions Check

Do your instructions look something like this?

* Put peanut butter on one slice of bread
* Put jelly on a different slice of bread
* Place the two slices on top of each other with the peanut butter facing the jelly

Perhaps your directions are more detailed. Working with a computer is like dealing with someone who takes everything you say literally: it will do exactly what you tell it to do. Here are some video examples of people acting out, for dramatic effect, the instructions that others have written:

**<a href="https://www.youtube.com/watch?v=RjHzD2sfWcQ" target="_blank">Could you describe how to make a P&J sandwich to an alien??</a>**

**<a href="https://www.youtube.com/watch?v=cDA3_5982h8" target="_blank">Exact Instructions Challenge</a>**

Do you feel you could write a list of instructions that could hold up to that kind of treatment? What sorts of things would you need to include and what (if anything) could be assumed?

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 6 - Breaking Down Problems<a class="anchor" id="DS102L1_page_6"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [5]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Breaking Down Problems
VimeoVideo('325919744', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L01-pg6tutorial.zip)**.

# Breaking Down Problems

There is one more type of challenge you will look at in this lesson: deconstructing word problems. This type of problem is the most challenging for many people. Word problems, while potentially intimidating, closely resemble real-world data science programming tasks. It is very unlikely that you will be handed a well-written specification from a client; instead, the desired solution will usually be expressed in a verbal or text-based communication. The actual implementation of that solution requires you to pick apart what you have been given, connect what is being asked for with the set of tools you have at your disposal, and then formulate a plan for solving the problem.

Consider the problem statement:

```text
I have a list of scores from a survey sent out to our clients. I want to know how well we did across all the scores as a single number.
```

What data is present in this sentence? What kind of data is it?

**After you've come up with an answer, compare it with this one:** A list of scores (numbers)


Once you know the basic elements that are available, you can determine what to do with them. Should you add all the numbers together? Should you pick the largest or the smallest number?

The problem is asking for a single number that represents the values of all the individual scores. None of the suggestions above would accomplish this.

If you have several very high numbers and a few low numbers, you would expect the answer to be relatively high. If you had several low numbers and one single high number, you might expect the answer to be higher than your low numbers but well below the single high number.

So, what kind of operation is being asked for here?

**After you've come up with an answer, compare it with this one:** The mean (average) of all the scores.

What's important here is identifying the question. In this case, that required understanding what an average is and recognizing that the problem was asking for one. If you did not know what an average was ahead of time, it would be difficult to solve that particular problem.
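To see why the mean fits the request better than the other candidates, it can help to compute all of them on the same data. This is a small sketch in Python; the `scores` list is invented purely for illustration and does not come from any real survey.

```python
from statistics import mean

# Hypothetical survey scores (made up for this example): mostly high, a few low.
scores = [9, 8, 10, 9, 2, 4]

total = sum(scores)     # keeps growing as more clients respond
highest = max(scores)   # ignores every score except the best one
lowest = min(scores)    # ignores every score except the worst one
average = mean(scores)  # a single number that reflects all of the scores

print("total:", total)
print("highest:", highest)
print("lowest:", lowest)
print("average:", average)
```

With mostly high scores and a couple of low ones, the average lands between the lowest and highest values and closer to the high end, which matches the expectation described above.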

Learning new programming techniques gives you more tools to apply to any given problem. If a problem seems impossible, it is essential to go through the tools that you have and brush up on less familiar ones. Discovering the right tool is often the hardest part of solving a problem.


---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 7 - Breaking Down Problems Review<a class="anchor" id="DS102L1_page_7"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">



```c-lms
topic: Breaking Down Problems Review
```

# Breaking Down Problems Review

The skills of deconstructing problems and then precisely communicating your instructions will take time to hone. Fortunately, you will have plenty of chances to practice these skills. Remember that the process of breaking down the problem and identifying what you want to do next is often the trickiest part. Writing out your lists of `what do I have to work with` and `what steps are needed to accomplish my goal` is a valuable skill at any level of experience or technical competency. It's no coincidence that whiteboards and sticky notes are widespread in areas where data scientists are working.

## Review

Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [6]:
try:
    from DS_Students import MultipleChoice
    from ipynb.fs.full.DS102Questions import *
except ImportError:
    # The quiz helper package is missing: install it, then retry the imports.
    !pip install DS_Students
    from DS_Students import MultipleChoice
    from ipynb.fs.full.DS102Questions import *

In [7]:
try:
    display(L1P7Q1, L1P7Q2)
except:
    pass

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 8 - Mistakes are INEVITABLE<a class="anchor" id="DS102L1_page_8"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [8]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Mistakes are Inevitable
VimeoVideo('325919805', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L01-pg8tutorial.zip)**.

# Mistakes are INEVITABLE

> _“Experience is the name everyone gives to their mistakes.”_ – Oscar Wilde

In programming there is no truer statement.  You cannot get better at programming without making a lot of mistakes along the way.  In fact, a very common truism in programming is: 

> _"If something works the first time you run it, be very afraid."_

Mistakes come in all types.  Misspell a word here, use the wrong character there, forget a semicolon or curly brace somewhere else, and all of a sudden you’ll see red on your screen. This should not cause fear.  In fact, errors are your friends.  They’re the greatest teachers and the best pathway to learning.  In order to understand what you need to do, you often first need to know what _NOT_ to do.

See if you can figure out what’s wrong with the following sentence:

```text
My friend Ron is from The United Kingdom and he loves French cuisine.
```

Did you spot the error? Great! There is, in fact, _more than one_ problem with the sentence above. Did you find them all, or did you stop once you discovered the first one?

A common issue beginning developers have is that once they encounter an error with their code, they tend to stop paying attention to the rest of it.  They can become fixated on the one problem and not realize another problem might also be lurking.  After they fix the first bug, they will see another error message due to the second problem, but believe the error message is still caused by the first error.  It can take days to realize they fixed the original issue all along.

Even experienced developers can spend days trying to fix an error in their code.  Since a computer interprets the language literally, you must be perfect in how you articulate the instructions &mdash; you saw this with the peanut butter and jelly sandwich example.  The instructions might be very clear to you as you write them, but when taken literally they don’t make sense.

Looking back at the sentence about Ron, despite having two errors, were you still able to understand the meaning?  More than likely you were; however, a computer doesn’t think the same as you do.  It takes a very keen eye and intricate understanding of the language to know that the `T` in the word `the`, when used as an adjective before a proper noun, should be lowercase. That’s the first error. The second error is that there should be a comma before the conjunction `and` to separate the two independent clauses. Miss a comma between arguments when coding, and your computer will definitely be unhappy!
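As a rough illustration of how strict a computer is about punctuation, the sketch below asks Python whether two pieces of text are valid expressions. The helper name `is_valid_expression` is hypothetical and exists only for this example; `compile` is Python's built-in function for checking and preparing code, and here it is used only to check the text, not to run it.

```python
def is_valid_expression(source):
    """Return True if Python can parse the text as a single expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        # Python could not make sense of the text as written.
        return False


print(is_valid_expression("max(3, 7)"))  # comma between the arguments
print(is_valid_expression("max(3 7)"))   # comma missing
```

A human reader would guess what `max(3 7)` is supposed to mean, but Python refuses to guess and reports a syntax error instead.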

---

## Logical Errors

There is something else interesting about the sentence above: some grammar-correction software may not detect the errors. As far as the grammar-correction software can tell, `The` might be part of the proper noun `The United Kingdom`.  It has no way of knowing your intention is to use `the` as an adjective to `United Kingdom`.  A similar issue is happening with the missing comma.  Without understanding the intent of the sentence the grammar-correction software will not know that it is two conjoined sentences.  It can only interpret the words in their literal form.

Both of these errors are what would be called a `logical error` in programming.  The sentence is grammatically correct as far as the grammar-correction software is concerned, but there are logical errors that cannot be determined by the computer since it can only understand the text literally.  The rules of capitalization and punctuation often require an understanding of the intention of the phrase trying to be conveyed, and this is not something a computer can determine easily.  The same is true with programming.  The computer cannot determine the intent of your code, only the syntax you used to write it.
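Here is a small sketch of what a logical error can look like in code. Both functions below are invented for this example. Each one runs without any error message, but only one of them does what the author intended, which was to average two numbers.

```python
def average_with_bug(a, b):
    # Runs without complaint, but division happens before addition,
    # so this actually computes a + (b / 2).
    return a + b / 2


def average_fixed(a, b):
    # The parentheses make the intent explicit: add first, then divide.
    return (a + b) / 2


print(average_with_bug(80, 100))  # 130.0
print(average_fixed(80, 100))     # 90.0
```

The computer has no way to know that `130.0` is not what was wanted. Catching this kind of mistake requires checking the result against what you expect, not just checking that the code runs.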

Even if you had a keen eye for both problems, consider how often you make grammatical errors in your own writing.  Now imagine that every grammatical error you ever made was taken literally by a computer.  Would the computer comprehend your instructions?  How often would your intention be misinterpreted by the computer? Could your instructions be performed exactly as you wrote them but have unintended results?

You will make many mistakes along the way.  Do not fear them; embrace them.  They will help you grow as a programmer.  The more mistakes you make, the faster you will grow.  As you explore R and Python, it is a good idea to take a little time to deliberately break your code after a successful attempt, so that you learn more about what works, what doesn't, and why.

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 9 - Imposter Syndrome <a class="anchor" id="DS102L1_page_9"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [9]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Imposter Syndrome
VimeoVideo('325919763', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L01-pg9tutorial.zip)**.

# Imposter Syndrome

One of the most commonly expressed worries among beginning data scientists is the feeling of not being good enough to be labeled a `data scientist`.  It is sometimes hard to see your own successes when you compare yourself to others.  You may feel like a fraud, or a fake.  There is a term for this feeling: *Imposter Syndrome*.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Check out this article about Imposter Syndrome in Data Science!</h3>
    </div>
    <div class="panel-body">
        <p>If you would like to read more about Imposter Syndrome specifically for data science, and tips to overcome it, check out <a href="https://caitlinhudon.com/2018/01/19/imposter-syndrome-in-data-science/" target="_blank">this blog</a>.</p>
    </div>
</div>

Due to the vast scale of information that a data scientist must comprehend, it's easy to understand why this feeling exists.  However, EVERYONE is facing the same challenges - remember that data science is still a relatively young field!

A person can spend years learning a spoken language and still only barely pass in a community of native speakers. In fact, learning how to program is a lot like learning a new language, because it **IS** a new language!

Your brain has to be trained how to think like a programmer using a programming language.  It takes years to perfect, and it can be intimidating when there are peers who seemingly understand the language better.

---

## Hard To See Success

It is easy to ignore your own success and attribute it to luck or deception.  The result is that the feeling may not go away even after proving your ability to perform the tasks at hand. And as a junior data scientist, there will be several situations where you will feel in over your head.  It’s a significant part of learning.

> _"Every great developer you know got there by solving problems they were unqualified to solve until they actually did it."_ - Patrick McKenzie

Questioning your skills and abilities is one of the driving forces behind expanding your learning.  Great developers embrace the struggle of not knowing the solution and use it as a means to further their academic understanding. Trying to learn new techniques and become better skilled at a craft is a continuous struggle.  Perfection is not obtainable, but improvement is.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>You may want to watch this <a href="https://vimeo.com/456013155"> recorded live workshop on imposter syndrome and how to combat it! </a> </p>
    </div>
</div>

---

## Review

Take the quiz below to review the recently covered material. Quizzes are _not_ graded.

In [10]:
try:
    display(L1P9Q1, L1P9Q2, L1P9Q3)
except:
    pass

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 10 - Key Terms<a class="anchor" id="DS102L1_page_10"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

```c-lms
topic: Key Terms
```

Below is a list of the important key terms you have learned in this lesson, each with a short description. Please read through the list, and go back to review any concepts you don't fully understand. Great work!

<table class="table table-striped">
  <thead>
    <tr>
      <th>Term</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="font-weight: bold;" nowrap>Conversion</td>
      <td>The process of changing or causing something to change from one form to another.</td>
    </tr>
    <tr>
      <td style="font-weight: bold;" nowrap>Logical Errors</td>
      <td>A mistake in a program's source code that results in incorrect or unexpected behavior.</td>
    </tr>
    <tr>
      <td style="font-weight: bold;" nowrap>Imposter Syndrome</td>
      <td>A psychological pattern in which an individual doubts their accomplishments and has a persistent internalized fear of being exposed as a "fraud."</td>
    </tr>
  </tbody>
</table>

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 11 - Lesson 1 Hands-On<a class="anchor" id="DS102L1_page_11"></a>

[Back to Top](#DS102L1_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

Lesson 1 Hands-On 45 points

## Directions

For this Hands-On, the requirements below give you a problem statement. Write out the steps needed to accomplish the task it describes. Save your work in a text document named HandsOnL01 and be sure to upload your zipped file when you are finished. Although this Hands-On will not be graded, the best way to become a data scientist is to practice!

## Caution!
Do not submit your project until you have completed all requirements, as you will not be able to resubmit.

## Requirements
You have a bag filled with jellybeans of three different colors (pink, green, and yellow) and three cups, arranged in a line in front of you, that cannot be moved. The end result should be that all the jellybeans have been removed from the bag and placed into the cups. Each cup should contain only one color. The jellybeans should be placed into the cups so that the smallest pile is on the left and the largest pile is on the right. Jellybeans cannot be placed anywhere except in the bag or in a cup.

## Caution!
Submit your HandsOnL01 zipped text document with your solution in the area below.

## Tip!
To zip your file on Windows, right click on the file and select "Send to", then select "Compressed (zipped) folder". For Mac users, right click on the file and select "Compress", then select your file from the options.

