# 20-06-23: Daily Data Practice

---

## Daily Practices

* Practice with DS/ML tools and processes
  * [fast.ai course](https://course.fast.ai/)
  * Hands-on ML | NLP In Action | Dive Into Deep Learning | Coursera / guided projects
    * Read, code along, take notes
    * _test yourself on the concepts_ — i.e. do all the chapter exercises
  * Try to hit benchmark accuracies with [UCI ML datasets](https://archive.ics.uci.edu/ml/index.php) or Kaggle
* Coding & Problem Solving Practice
  * HackerRank SQL or Packt SQL Data Analytics
  * Python on HackerRank or similar platform
* Meta Data: Review and write
  * Focus on a topic, review notes and resources, write a blog post about it
* 2-Hour Job Search
* Interviewing
  * Behavioral questions and scenarios
  * Business case walk-throughs
  * Hot-seat DS-related topics for recall practice (under pressure)

---

## Coding & Problem Solving Practice

> Work through practice problems on HackerRank or similar

### Python

#### [Mini-Max Sum](https://www.hackerrank.com/challenges/mini-max-sum/problem)

Given five positive integers, find the minimum and maximum values that can be calculated by summing exactly four of the five integers. Then print the respective minimum and maximum values as a single line of two space-separated long integers. 

```python
def miniMaxSum(arr):
    # My first thought is that I can simply sort the array then sum the first
    # four numbers for the min and last four for the max
    pass
```

```python
arr = [1, 3, 5, 7, 9]
arr.sort()

mini = sum(arr[:-1])
maxi = sum(arr[-4:])

print(mini, maxi)
```

```
16 24
```

```python
def miniMaxSum(arr):
    """Given five positive integers, finds the min and max values that can be
    calculated by summing exactly four of the five integers.
    """
    arr.sort()  # Make sure array is sorted
    print(sum(arr[:-1]), sum(arr[-4:]))
```

```python
miniMaxSum(arr)
```

```
16 24
```
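
Sorting is not strictly needed for this problem. Every four-element sum is the total minus one element, so the minimum sum drops the largest value and the maximum sum drops the smallest. A minimal sketch of that idea follows; the name `mini_max_sum` and the choice to return a tuple instead of printing are my own and not part of the HackerRank template.

```python
def mini_max_sum(arr):
    """Return (min_sum, max_sum) over all sums that leave out one element."""
    total = sum(arr)
    # Dropping the largest element gives the smallest sum, and vice versa
    return total - max(arr), total - min(arr)


print(*mini_max_sum([1, 3, 5, 7, 9]))  # 16 24
```

This version does not reorder the caller's list, whereas `arr.sort()` sorts it in place.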


#### [Birthday Cake Candles](https://www.hackerrank.com/challenges/birthday-cake-candles/problem)

You are in charge of the cake for your niece's birthday and have decided the cake will have one candle for each year of her total age. When she blows out the candles, she’ll only be able to blow out the tallest ones. Your task is to find out how many candles she can successfully blow out.

For example, if your niece is turning 4 years old, and the cake will have 4 candles of height 4, 4, 1, 3, she will be able to blow out 2 candles successfully, since the tallest candles are of height 4 and there are 2 such candles.

Complete the function birthdayCakeCandles below. It must return an integer representing the number of candles she can blow out. 

```python
# Complete the birthdayCakeCandles function below.
def birthdayCakeCandles(ar):
    # First idea is that I could use a counter data structure to count the
    # number of instances of each number, then just return the count of the
    # highest number (tallest candle) that has any instances.
    pass
```

```python
# === Example === #
from collections import Counter

arr = [4, 4, 1, 3]

counter = Counter(arr)  # Count the instances of each height

# Look up the count of the tallest candle. Note that most_common(1) would
# return the most frequent height, which is not necessarily the tallest.
counter[max(counter)]
```

```
2
```

```python
def birthdayCakeCandles(ar):
    """Returns the number of candles that share the tallest height."""
    counter = Counter(ar)
    return counter[max(counter)]
```

```python
birthdayCakeCandles(arr)
```

```
2
```
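
The count of the tallest candle can also be found in a single pass, without building a counter for every height. The sketch below is one way to do that; the function name `count_tallest` is hypothetical. The second call is a useful check because the most frequent height (1) is not the tallest (4).

```python
def count_tallest(heights):
    """Count how many candles share the maximum height, in one pass."""
    tallest = None
    count = 0
    for h in heights:
        if tallest is None or h > tallest:
            tallest, count = h, 1  # A new tallest candle resets the count
        elif h == tallest:
            count += 1
    return count


print(count_tallest([4, 4, 1, 3]))  # 2
print(count_tallest([1, 1, 1, 4]))  # 1
```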

#### [Time Conversion](https://www.hackerrank.com/challenges/time-conversion/problem)

For tomorrow.

### SQL

Applied SQL Analytics workshop on Packt.

#### Using Joins

The head of sales at your company would like a list of all customers who bought a car. We need to create a query that will return all customer IDs, first names, last names, and valid phone numbers of customers who purchased a car.

```SQL
SELECT
	c.customer_id,
	c.first_name,
	c.last_name,
	c.phone
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
INNER JOIN products p ON p.product_id = s.product_id
WHERE p.product_type = 'automobile'
AND c.phone IS NOT NULL;
```
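
One thing worth checking with this join is that it returns one row per qualifying sale, so a customer who bought two cars appears twice. The sketch below tries this with Python's built-in `sqlite3` module and an in-memory database. The table and column names follow the query above, but the rows are made up for illustration, and adding `DISTINCT` is my own suggestion rather than part of the workshop solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, first_name TEXT, last_name TEXT, phone TEXT);
    CREATE TABLE products (product_id INTEGER, product_type TEXT);
    CREATE TABLE sales (customer_id INTEGER, product_id INTEGER);

    -- Made-up rows for illustration only
    INSERT INTO customers VALUES
        (1, 'Ana', 'Lee', '555-0100'), (2, 'Bo', 'Kim', NULL), (3, 'Cy', 'Diaz', '555-0199');
    INSERT INTO products VALUES (10, 'automobile'), (11, 'automobile'), (20, 'scooter');
    INSERT INTO sales VALUES (1, 10), (1, 11), (2, 10), (3, 20);
""")

query = """
    SELECT {distinct} c.customer_id, c.first_name, c.last_name, c.phone
    FROM sales s
    INNER JOIN customers c ON c.customer_id = s.customer_id
    INNER JOIN products p ON p.product_id = s.product_id
    WHERE p.product_type = 'automobile'
    AND c.phone IS NOT NULL
"""

per_sale = conn.execute(query.format(distinct="")).fetchall()
per_customer = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(len(per_sale))   # 2: customer 1 bought two automobiles
print(per_customer)    # [(1, 'Ana', 'Lee', '555-0100')]
```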

#### Subqueries

Subqueries let you use the result of a `SELECT` query in place of an existing table in your database. Take a query, wrap it in parentheses, and give it an alias.

Find all the salespeople working in California:

```SQL
SELECT *
FROM salespeople
INNER JOIN (
  SELECT * FROM dealerships
  WHERE dealerships.state = 'CA'
  ) d
  ON d.dealership_id = salespeople.dealership_id
ORDER BY 1;
```

If a subquery returns only one column, you can use it with the `IN` keyword in a `WHERE` clause. For example, to get the details of salespeople whose dealership ID belongs to a California dealership:

```SQL
SELECT *
FROM salespeople
WHERE dealership_id IN (
  SELECT dealership_id FROM dealerships
  WHERE dealerships.state = 'CA'
  )
ORDER BY 1;
```
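
To check that the join form and the `IN` form select the same salespeople, here is a small sketch using Python's built-in `sqlite3` module. The rows are made up, and the `salesperson_id` column is an assumed name for illustration. Note that with `SELECT *` the join version would also return the dealership columns, so the sketch compares IDs only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dealerships (dealership_id INTEGER, state TEXT);
    CREATE TABLE salespeople (salesperson_id INTEGER, dealership_id INTEGER);

    -- Made-up rows for illustration only
    INSERT INTO dealerships VALUES (1, 'CA'), (2, 'NY'), (3, 'CA');
    INSERT INTO salespeople VALUES (100, 1), (101, 2), (102, 3), (103, NULL);
""")

# Subquery used as a derived table in a join
join_ids = [row[0] for row in conn.execute("""
    SELECT s.salesperson_id
    FROM salespeople s
    INNER JOIN (
      SELECT * FROM dealerships WHERE state = 'CA'
    ) d ON d.dealership_id = s.dealership_id
    ORDER BY 1
""")]

# Single-column subquery used with IN
in_ids = [row[0] for row in conn.execute("""
    SELECT salesperson_id
    FROM salespeople
    WHERE dealership_id IN (
      SELECT dealership_id FROM dealerships WHERE state = 'CA'
    )
    ORDER BY 1
""")]

print(join_ids, in_ids)  # [100, 102] [100, 102]
```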

#### Unions

Joins combine tables "horizontally": each join adds _columns_. Unions combine queries "vertically": they keep the same set of columns and stack the _rows_ of multiple queries.

For example, to plot both customers and dealerships on Google Maps, you need the addresses from both tables in a single result.

```SQL
(
  SELECT street_address, city, state, postal_code
  FROM customers
  WHERE street_address IS NOT NULL
)
UNION
(
  SELECT street_address, city, state, postal_code
  FROM dealerships
  WHERE street_address IS NOT NULL
)
ORDER BY 1;
```

Notes:

* The subqueries must return the same number of columns with compatible data types (here they also share the same names)
* `UNION` by default removes all duplicate rows in the output
  * `UNION ALL` retains the duplicate rows
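
The difference between `UNION` and `UNION ALL` is easy to see on a toy example. The sketch below uses Python's built-in `sqlite3` module with made-up rows and only two of the address columns. One assumption to flag: SQLite does not accept parentheses around the individual `SELECT` statements of a compound query, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (street_address TEXT, city TEXT);
    CREATE TABLE dealerships (street_address TEXT, city TEXT);

    -- Made-up rows; '1 Main St' appears in both tables
    INSERT INTO customers VALUES ('1 Main St', 'Los Angeles'), ('9 Oak Ave', 'Fresno');
    INSERT INTO dealerships VALUES ('1 Main St', 'Los Angeles'), ('5 Pine Rd', 'San Diego');
""")

template = """
    SELECT street_address, city FROM customers
    {op}
    SELECT street_address, city FROM dealerships
    ORDER BY 1
"""

union_rows = conn.execute(template.format(op="UNION")).fetchall()
union_all_rows = conn.execute(template.format(op="UNION ALL")).fetchall()

print(len(union_rows))      # 3: the shared address is kept once
print(len(union_all_rows))  # 4: the shared address is kept twice
```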

##### Exercise 2.02

Make a guest list with ZoomZoom customers who live in Los Angeles, CA, as well as salespeople who work at the ZoomZoom dealership in Los Angeles, CA. The guest list should include first and last names and whether the guest is a customer or an employee.

```SQL
(
  SELECT first_name, last_name, 'Customer' AS guest_type
  FROM customers
  WHERE city = 'Los Angeles'
  AND state = 'CA'
)
UNION
(
  SELECT first_name, last_name, 'Employee' AS guest_type
  FROM salespeople s
  INNER JOIN dealerships d ON d.dealership_id = s.dealership_id
  WHERE d.city = 'Los Angeles'
  AND d.state = 'CA'
);
```

---

## Reading

[Visual Guide to FastText Word Embeddings](https://amitness.com/2020/06/fasttext-embeddings/)



---

## 2-Hour Job Search

### Executive Summary

Soft skills/keywords

* Collaborative
* Lifelong learner
* Data-driven
* Dedicated
* Flexible / adaptable
* Persistent
* Broad-minded
* Growth mindset - always improving
* Candid / humble / honest
* Independent (self-driven)
* Motivated
* Story

Experiences

* Experience both working on and managing teams
* BS in Economics from Cal Poly, SLO
* Worked intimately with massive production databases

As a dedicated and passionate lifelong learner, I take pride in my ability to learn both quickly and deeply, as needed to accomplish specific goals. Collaborative, with a flexible, broad mind and a persistent growth mindset, I've successfully managed projects and teams, along with their timelines and expectations. I hold a BS in Economics from Cal Poly, SLO, where I was a Division I student-athlete and an independent musician and writer.

As a machine learning engineer, data scientist, and software developer, I have expertise in all aspects of building and deploying production-grade machine learning systems using cutting-edge tools and processes. I've worked on a number of interdisciplinary teams in which my teammates and I were responsible for the entire process, from defining the business problem in terms of software and data to building data pipelines, conducting analyses, training and evaluating various algorithms, and deploying and maintaining production models.

I recognize the role that narrative plays in shaping the world and each of our experiences in it, and I believe data is a tool for telling true stories.

I love building systems and processes that serve as practical solutions to complex problems.

#### Edit 2

As a dedicated and passionate lifelong learner, I take pride in my ability to learn both quickly and deeply, as needed to accomplish specific goals. Collaborative, with a flexible, broad mind and a persistent growth mindset, I've successfully managed projects and teams, along with their timelines and expectations. I hold a BS in Economics from Cal Poly, SLO, where I was a Division I student-athlete and an independent musician and writer.

I've worked on a number of interdisciplinary teams building systems and processes to solve complex problems, and have expertise in all aspects of building and deploying production-grade machine learning systems using cutting-edge tools and processes.

---

## DS + ML Practice

* Pick a dataset and try to do X with it
  * Try to hit benchmark accuracies with [UCI ML datasets](https://archive.ics.uci.edu/ml/index.php) or Kaggle
* Practice with the common DS/ML tools and processes
  * Hands-on ML | NLP In Action | Dive Into Deep Learning | Coursera / guided projects
  * Machine learning flashcards

#### _The goal is to be comfortable explaining the entire process._

* Data access / sourcing, cleaning
  * Exploratory data analysis
  * Data wrangling techniques and processes
* Inference
  * Statistics
  * Probability
  * Visualization
* Modeling
  * Implement + justify choice of model / algorithm
  * Track performance + justify choice of metrics
    * Communicate results as relevant to the goal

---

## Meta Data: Review and Write

> Focus on a topic or project, learn/review the concepts, write (a blog post) about it

---

## Interviewing

> Practice answering the most common behavioral and technical interview questions

### Technical

* Business case walk-throughs
* Hot-seat DS-related topics for recall practice (under pressure)

### Behavioral

* "Tell me a bit about yourself"
* "Tell me about a project you've worked on and are proud of"
* "What do you know about our company?"
* "Where do you see yourself in 3-5 years?"
* "Why do you want to work here / want this job?"
* "What makes you most qualified for this role?"
* "What is your greatest strength/weakness?"
  * "What is your greatest technical strength?"
* "Tell me about a time when you had conflict with someone and how you handled it"
* "Tell me about a mistake you made and how you handled it"
* Scenario questions (STAR: situation, task, action, result)
  * Success story / biggest accomplishment
  * Greatest challenge (overcome)
  * Persuaded someone who did not agree with you
  * Dealt with and resolved a conflict (among team members)
  * Led a team / showed leadership skills or aptitude
  * How you've dealt with stress / stressful situations
  * Most difficult problem encountered in previous job; how you solved it
  * Solved a problem creatively
  * Exceeded expectations to get a job done
  * Showed initiative
  * Something that's not on your resume
  * Example of important goal you set and how you reached it
  * A time you failed
* "Do you have any questions for me?"
  * What is your favorite aspect of working here?
  * What has your journey looked like at the company?
  * What are some challenges you face in your position?