## Software Engineering for Data Scientists in Python
* Modularity
* Documentation
* Automated Testing

### Python Modularity

In [13]:
# import the numpy package
import numpy as np

# create an array class object
arr = np.array([8,6,7,5,3,0,9])

# use arr sort method
arr.sort()

# print sorted array
print(arr)

[0 3 5 6 7 8 9]


* Note: the array's own `arr.sort()` method sorts the array in place and returns `None`. This differs from the function `np.sort(arr)`, which returns a sorted copy and leaves the original unchanged.

In [14]:
type(arr.sort())

NoneType
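Python's built-in lists follow the same convention, so the difference is easy to try without NumPy. The sketch below is an illustration, not part of the original exercise: `sorted()` plays the role of `np.sort(arr)` and `list.sort()` plays the role of `arr.sort()`.

```python
values = [8, 6, 7, 5, 3, 0, 9]

# sorted() returns a new list and leaves `values` untouched
sorted_copy = sorted(values)
unchanged = list(values)  # snapshot taken before the in-place sort

# list.sort() reorders `values` in place and returns None
result = values.sort()

print(sorted_copy)
print(unchanged)
print(result)
print(values)
```

Assigning the result of an in-place sort (`values = values.sort()`) is a common mistake, because it replaces the data with `None`.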

### Leveraging documentation
When writing code for data science, it's inevitable that you'll need to install and use someone else's code. You'll quickly learn that using someone else's code is much more pleasant when they follow good software engineering practices. In particular, good documentation makes the right way to call a function obvious. In this exercise you'll use Python's built-in `help()` function to view a method's documentation so you can determine how to call it correctly.

```python
# load the Counter class into our environment
from collections import Counter

# View the documentation for Counter.most_common
# (help() prints the text itself and returns None, so no print() is needed)
help(Counter.most_common)

most_common(self, n=None)
    List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.
    
    >>> Counter('abracadabra').most_common(3)
    [('a', 5), ('b', 2), ('r', 2)]
```
* The documentation shows that `most_common` takes an optional argument `n` and returns a list of `(element, count)` tuples, ordered from the most common element to the least common.
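The text that `help()` displays comes from the function's docstring, so the same mechanism works for your own code. Here is a minimal sketch; `count_vowels` is a hypothetical function invented for this illustration.

```python
import inspect


def count_vowels(text):
    """Return the number of vowels in text.

    >>> count_vowels('abracadabra')
    5
    """
    return sum(1 for char in text.lower() if char in "aeiou")


# help(count_vowels) would print the signature and docstring interactively;
# inspect.getdoc returns the cleaned-up docstring as a string instead
doc = inspect.getdoc(count_vowels)
print(doc)
```

Including a short usage example in the docstring, as `Counter.most_common` does, tells the reader both how to call the function and what to expect back.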

```python
# use Counter to find the top 5 most common words
top_5_words = Counter(words).most_common(n=5)

# display the top 5 most common words
print(top_5_words)

# Output
[('@DataCamp', 299), ('to', 263), ('the', 251), ('in', 164), ('RT', 158)]

```
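The `words` list above is preloaded in the exercise, so the snippet does not run on its own. A self-contained version might look like the following, where the sentence is a made-up stand-in for the tweet data.

```python
from collections import Counter

# Hypothetical stand-in for the exercise's preloaded `words` list
words = "to the moon to the stars to infinity".split()

# Counter tallies each word; most_common(n) returns the n highest counts
top_2_words = Counter(words).most_common(n=2)

print(top_2_words)
```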

### Using pycodestyle
We saw earlier that pycodestyle can be run from the command line to check a file for PEP 8 compliance. Sometimes it's useful to run this kind of check from a Python script.

```python
# Import needed package
import pycodestyle

# Create a StyleGuide instance
style_checker = pycodestyle.StyleGuide()

# Run PEP 8 check on multiple files
result = style_checker.check_files(['nay_pep8.py', 'yay_pep8.py'])

# Print result of PEP 8 style check
print(result.messages)

```

```python
nay_pep8.py:1:1: E265 block comment should start with '# '
nay_pep8.py:2:6: E225 missing whitespace around operator
nay_pep8.py:4:2: E131 continuation line unaligned for hanging indent
nay_pep8.py:5:6: E131 continuation line unaligned for hanging indent
nay_pep8.py:6:1: E122 continuation line missing indentation or outdented
nay_pep8.py:7:1: E265 block comment should start with '# '
nay_pep8.py:8:1: E402 module level import not at top of file
nay_pep8.py:9:1: E265 block comment should start with '# '
nay_pep8.py:10:1: E302 expected 2 blank lines, found 0
nay_pep8.py:10:18: E231 missing whitespace after ','
nay_pep8.py:11:2: E111 indentation is not a multiple of 4
nay_pep8.py:12:2: E111 indentation is not a multiple of 4
nay_pep8.py:14:1: E265 block comment should start with '# '
nay_pep8.py:15:1: E305 expected 2 blank lines after class or function definition, found 1
nay_pep8.py:16:11: E111 indentation is not a multiple of 4
nay_pep8.py:16:11: E117 over-indented
nay_pep8.py:16:17: E225 missing whitespace around operator
nay_pep8.py:16:32: E222 multiple spaces after operator
nay_pep8.py:16:32: E251 unexpected spaces around keyword / parameter equals
nay_pep8.py:16:38: E231 missing whitespace after ','
nay_pep8.py:16:44: E221 multiple spaces before operator
nay_pep8.py:16:44: E251 unexpected spaces around keyword / parameter equals
nay_pep8.py:16:47: E251 unexpected spaces around keyword / parameter equals
nay_pep8.py:17:11: E111 indentation is not a multiple of 4
nay_pep8.py:17:17: E201 whitespace after '('
nay_pep8.py:17:25: E202 whitespace before ')'
nay_pep8.py:17:27: W292 no newline at end of file
```
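Each report line follows the layout `path:line:column: code message`, so a long report can be summarized by error code. The sketch below uses only the standard library and a few lines copied from the report above; `LINE_PATTERN` and `parse_report` are hypothetical names chosen for this illustration, not part of pycodestyle.

```python
import re
from collections import Counter

# A few lines copied from the report above
report_lines = [
    "nay_pep8.py:1:1: E265 block comment should start with '# '",
    "nay_pep8.py:2:6: E225 missing whitespace around operator",
    "nay_pep8.py:7:1: E265 block comment should start with '# '",
    "nay_pep8.py:11:2: E111 indentation is not a multiple of 4",
    "nay_pep8.py:17:27: W292 no newline at end of file",
]

# Layout of one report line: path:line:column: CODE message
LINE_PATTERN = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<code>[EW]\d+) (?P<message>.+)$"
)


def parse_report(lines):
    """Turn report lines into dictionaries, skipping lines that do not match."""
    records = []
    for text in lines:
        match = LINE_PATTERN.match(text)
        if match:
            record = match.groupdict()
            record["line"] = int(record["line"])
            record["col"] = int(record["col"])
            records.append(record)
    return records


records = parse_report(report_lines)
code_counts = Counter(record["code"] for record in records)
print(code_counts.most_common())
```

Grouping by code shows which kinds of violation dominate a file, which helps when deciding what to fix first.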

In [16]:
def print_phrase(phrase, polite=True, shout=False):
    if polite:# It's generally polite to say please
        phrase = 'Please ' + phrase

    if shout:  #All caps looks like a written shout
        phrase = phrase.upper() + '!!'

    print(phrase)


#Politely ask for help
print_phrase('help me', polite=True)
 # Shout about a discovery
print_phrase('eureka', shout=True)


Please help me
PLEASE EUREKA!!


* pycodestyle flags the comment spacing and indentation in the script above. These are style violations, not errors raised at runtime:
```python
my_script.py:2:15: E261 at least two spaces before inline comment
my_script.py:5:16: E262 inline comment should start with '# '
my_script.py:11:1: E265 block comment should start with '# '
my_script.py:13:2: E114 indentation is not a multiple of four (comment)
my_script.py:13:2: E116 unexpected indentation (comment)
```

In [15]:
def print_phrase(phrase, polite=True, shout=False):
    if polite:  # It's generally polite to say please
        phrase = 'Please ' + phrase

    if shout:  # All caps looks like a written shout
        phrase = phrase.upper() + '!!'

    print(phrase)


# Politely ask for help
print_phrase('help me', polite=True)
# Shout about a discovery
print_phrase('eureka', shout=True)


Please help me
PLEASE EUREKA!!


* The corrected version above fixes the inline and block comment spacing that pycodestyle flagged in the first version. A `StyleGuide` instance, used as in the previous exercise, reports the same violations.

### Writing a Python Module
* What are the minimal requirements to make an importable Python package?
    - A directory containing a file named `__init__.py` (the file may be empty)
    
<br>
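To see that a directory plus an `__init__.py` file is enough, the sketch below builds such a package in a temporary directory and imports it. The package name `demo_package` and its docstring are hypothetical, chosen only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A package is a directory that contains an __init__.py file
    pkg_dir = Path(tmp) / "demo_package"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text('"""A minimal demo package."""\n')

    # Make the parent directory importable for the duration of the demo
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_package = importlib.import_module("demo_package")
        package_doc = demo_package.__doc__
    finally:
        # Undo the changes to the import system
        sys.path.remove(tmp)
        sys.modules.pop("demo_package", None)

print(package_doc)
```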

#### Naming Convention
* PEP 8 instructs that package names be all lowercase and only use underscores when it improves readability

```python
# candidate names: text_analyzer, textAnalyzer, TextAnalyzer, __text_analyzer__

# easiest to read and follows the convention
import text_analyzer
```

* Recognizing packages
The structure of your directory tree is printed below. You'll be working in the file my_script.py that you can see in the tree.

```python
recognizing_packages
├── MY_PACKAGE
│   └── _init_.py
├── package
│   └── __init__.py
├── package_py
│   └── __init__
│       └── __init__.py
├── py_package
│   └── __init__.py
├── pyackage
│   └── init.py
└── my_script.py
```

* Package Selection

```python
# Import local packages
import py_package
import package

# View the help for each package
help(py_package)
help(package)

```
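The selection above can be checked mechanically: a directory counts as a regular package when it directly contains a file named `__init__.py`. The sketch below recreates the tree in a temporary directory and applies that rule; `find_regular_packages` is a hypothetical helper written for this illustration. Python 3 can also import directories without `__init__.py` as namespace packages, so this check covers regular packages only.

```python
import tempfile
from pathlib import Path

# Files from the directory tree above, relative to recognizing_packages
layout = [
    "MY_PACKAGE/_init_.py",
    "package/__init__.py",
    "package_py/__init__/__init__.py",
    "py_package/__init__.py",
    "pyackage/init.py",
    "my_script.py",
]


def find_regular_packages(root):
    """Return names of directories in root that directly contain __init__.py."""
    return sorted(
        entry.name
        for entry in Path(root).iterdir()
        if entry.is_dir() and (entry / "__init__.py").is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    # Recreate the tree with empty files
    for relative in layout:
        path = Path(tmp) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = find_regular_packages(tmp)

print(found)  # ['package', 'py_package']
```

`package_py` is excluded because its `__init__` entry is a directory rather than a file, and `MY_PACKAGE` and `pyackage` are excluded because their files are misnamed.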

### Adding functionality to your package
In the file `counter_utils.py`, you will write two functions to be part of your package: `plot_counter` and `sum_counters`. The structure of your package can be seen in the tree below. For the coding portions of this exercise, you will be working in the file `counter_utils.py`.

```python
text_analyzer
├── __init__.py
└── counter_utils.py
```

In [17]:
# Import needed functionality
from collections import Counter


def plot_counter(counter, n_most_common=5):
    # Subset the n_most_common items from the input counter
    top_items = counter.most_common(n_most_common)
    # Plot `top_items` (plot_counter_most_common is not defined in these notes;
    # it is assumed to be provided by the exercise environment)
    plot_counter_most_common(top_items)


In [18]:
# Import needed functionality
from collections import Counter


def sum_counters(counters):
    # Sum the inputted counters, starting from an empty Counter
    return sum(counters, Counter())
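The second argument to `sum` matters here: by default `sum` starts from the integer `0`, and adding a `Counter` to an integer raises a `TypeError`. The sketch below shows both cases with made-up sample counts.

```python
from collections import Counter


def sum_counters(counters):
    """Combine an iterable of Counter objects into one Counter."""
    return sum(counters, Counter())


# Hypothetical sample data standing in for per-tweet word counts
sample_counts = [Counter(a=1, b=2), Counter(b=1, c=3)]

total = sum_counters(sample_counts)
print(total["a"], total["b"], total["c"])  # 1 3 3

# Without the empty Counter as the start value, sum() begins at 0 and fails
try:
    sum(sample_counts)
except TypeError:
    failed_without_start = True
else:
    failed_without_start = False

print(failed_without_start)  # True
```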


* You just wrote two functions for your package in the file `counter_utils.py`: `plot_counter` and `sum_counters`.

Which of the following lines would correctly import these functions in `__init__.py` using relative import syntax?


* `from counter_utils import plot_counter, sum_counters`

* **`from .counter_utils import plot_counter, sum_counters`** (correct: the leading dot makes the import relative to the current package)

* `from . import plot_counter, sum_counters`

* `from .counter_utils import plot_counter & sum_counters`
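The relative import form `from .counter_utils import ...` can be tried end to end by writing a small copy of the package to a temporary directory. This is a sketch under stated assumptions: it includes only `sum_counters`, because `plot_counter` depends on a plotting helper that is not defined in these notes.

```python
import importlib
import sys
import tempfile
from collections import Counter
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "text_analyzer"
    pkg_dir.mkdir()

    # counter_utils.py holds the implementation
    (pkg_dir / "counter_utils.py").write_text(
        "from collections import Counter\n"
        "\n"
        "\n"
        "def sum_counters(counters):\n"
        "    return sum(counters, Counter())\n"
    )

    # __init__.py re-exports it with a relative import (note the leading dot)
    (pkg_dir / "__init__.py").write_text(
        "from .counter_utils import sum_counters\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        text_analyzer = importlib.import_module("text_analyzer")
        # The function is now reachable directly from the package
        total = text_analyzer.sum_counters([Counter("ab"), Counter("bc")])
    finally:
        # Undo the changes to the import system
        sys.path.remove(tmp)
        for name in ("text_analyzer", "text_analyzer.counter_utils"):
            sys.modules.pop(name, None)

print(total)
```

Because `__init__.py` performs the import, users can call `text_analyzer.sum_counters` without knowing which file inside the package defines it.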



### Using your package's new functionality

You've now added some useful text analysis functionality to your package. In this exercise, you'll use your package to analyze some tweets written by DataCamp and DataCamp users.

The object `word_counts` is loaded into your environment. It is a list of `Counter` objects holding word counts from a sample of DataCamp tweets.

* Structure
```python
working_dir
├── text_analyzer
│    ├── __init__.py
│    └── counter_utils.py
└── my_script.py
```

```python
# Import local package
import text_analyzer

# Sum word_counts using sum_counters from text_analyzer
word_count_totals = text_analyzer.sum_counters(word_counts)

# Plot word_count_totals using plot_counter from text_analyzer
text_analyzer.plot_counter(word_count_totals)
```