# Lesson: Python for Data Science

## Introduction to Programming

**Programming is critical to any data science course, as it is a fundamental enabling technology; without programming or coding, we wouldn't be able to actually apply any of the things we're going to learn.**

[Python](https://www.python.org/) will be used (via [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) in particular because it is *relatively* simple and is widely applicable. It also has a huge (& growing!) user-base, which means there are many pre-made tools available which are perfect for data science-y things. 

Given our situation, we will assume no prior knowledge of programming or Python. If you ARE familiar w/ programming & Python, please bear w/ us; actually, let me know, as I may enlist your help in getting others up to speed. 

## What is Programming? 

Well, programming (or coding) is our way of communicating w/ a digital device, like a computer or a phone. Usually this means we're bossing it around b/c, while computers have plenty of processing power, they have no understanding of how to use that power (i.e. they have no intuition), so we must give them instructions. 

This is generally quite difficult because computers & humans speak different languages: [Machine code](https://en.wikipedia.org/wiki/Machine_code) versus (e.g.) English. You can think of machine code a little bit like [what Neo sees after he realizes he's the One](https://youtu.be/Vy7RaQUmOzE?t=198):

<img src="Figures/matrixcode.png" alt="Matrix code" width="750" />

Like I said, kinda... Machine code is strictly numeric in nature. It isn't meant for humans; it's meant to give precise & quick instructions to the [central processing unit (CPU)](https://en.wikipedia.org/wiki/Central_processing_unit). So what we need is some way to write instructions in a human-readable form which can then be translated into machine code. 

You might be thinking: Why not English? Or French? Or Japanese? The primary reason is that what we normally think of as languages -- that is, the constructs we use to communicate w/ one another -- are very inefficient when it comes to giving instructions. If I asked you to explain to me how to create a new TikTok (or w/e you kids are doin these days) in written form, it would probably take a lot longer than it would if you just showed me, right? Not only that, but I'm liable to get vastly different instructions from each of you. That isn't what we want: a computer needs instructions that are precise & consistent.\*

**This brings us to programming languages, which are a sort of in-between: They're developed with words & symbols we recognize and are structured in a way we recognize, but also in a way which is more efficient & can be easily translated into machine code for execution.** These translations are done by computer programs called [compilers](https://en.wikipedia.org/wiki/Compiler) or interpreters (Python is usually run through an interpreter). So what does this all mean? Well, like traditional languages, programming languages have rules which define *words* and *grammar* -- **what we call syntax.** This syntax (which the compiler or interpreter translates) includes definitions for tasks like addition, multiplication, file writing, picture generation, data extraction, matrix manipulation, etc., & it's what we need to learn in order to do data science. 
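To make the idea of syntax a bit more concrete, here's a minimal sketch in Python. The variable names (`total`, `product`, `broken_code`, `is_valid`) are made up for illustration. It shows two of the tasks mentioned above, addition & multiplication, and then what happens when we break the grammar rules: Python refuses the code rather than guessing what we meant.

```python
# Arithmetic uses symbols we already recognize from math class
total = 2 + 3      # addition
product = 4 * 5    # multiplication

print(total)    # 5
print(product)  # 20

# Grammar matters: this line is like an unfinished sentence.
# The built-in compile() checks source text without running it.
broken_code = "total = 2 +"
try:
    compile(broken_code, "<example>", "exec")
    is_valid = True
except SyntaxError:
    # Python rejects code which doesn't follow its syntax rules
    is_valid = False

print(is_valid)  # False
```

You'd rarely call `compile()` yourself; it's only used here so the syntax error can be caught & shown without stopping the rest of the example.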

\*Caveat: Don't get me wrong, it isn't as if we can only successfully write code in a single way. But the logic of programming languages typically constrains us in such a way so that the [cardinality](https://en.wikipedia.org/wiki/Cardinality) of the set of options isn't infinite. 
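As a small illustration of that caveat, here's a sketch (w/ made-up names like `numbers`, `total_loop`, & `total_builtin`) of two different, equally valid ways to add up the same list of numbers in Python:

```python
numbers = [1, 2, 3, 4]

# Option 1: add the numbers up one at a time w/ a loop
total_loop = 0
for n in numbers:
    total_loop = total_loop + n

# Option 2: use Python's built-in sum() function
total_builtin = sum(numbers)

print(total_loop)     # 10
print(total_builtin)  # 10
```

Both give the same answer; the language's rules just limit how many sensible ways there are to get there.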

## The Importance of Logic

[Logic -- tee hee](https://en.wikipedia.org/wiki/Logic_(rapper)) is truly the foundation of programming... 