# Lab 6: Strings

## Introduction 

Data professionals work with a lot of string data. For example, when analyzing the results of a marketing campaign, I may need to review item descriptions or customer names, which are stored as string data. Becoming comfortable working with strings in Python is essential for the work of a data professional.

In this lab, I have practiced coding in Python and working with strings. I have worked with a store ID, ZIP Code, and a custom URL for the store I've been gathering data on.

## Task 1: Check and change data types

Now that you have experience in marketing, you've moved on to market research. Your new task is collecting store data for future analysis. In this task, you're given a four-digit numeric store ID stored in a variable called `store_id`.

1.  Convert `store_id` to a string and store the result in the same variable.
2.  Confirm the type of `store_id` after the conversion.



In [1]:
store_id = 1101

# 1. ### YOUR CODE HERE ###
store_id = str(store_id)

# 2. ### YOUR CODE HERE ###
print(type(store_id))

<class 'str'>
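
The short sketch below shows why an identifier like this is often kept as a string rather than a number. Only `store_id = 1101` comes from the lab; the name `store_id_str` and the `"02806"` literal are used here purely for illustration.

```python
store_id = 1101
store_id_str = str(store_id)  # illustrative name, not part of the lab

# isinstance() is an alternative to reading the output of type().
print(isinstance(store_id_str, str))  # True

# Strings support character-level operations that integers do not.
print(len(store_id_str))  # 4
print(store_id_str[0])    # 1

# Converting a zero-padded string to int drops the leading zero,
# which is one reason identifiers are usually stored as strings.
print(int("02806"))  # 2806
```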


## Task 2: String concatenation

As you continue gathering data, you realize that the `store_id` variable is actually the ZIP Code where the store is located, but the leading `0` has been cut off.

*  Define a function called `zip_checker` that accepts the following argument:
    *  `zipcode` - a string with either four or five characters

*  Return:
    *  If `zipcode` has five characters, and the first two characters are NOT `'00'`, return `zipcode` as a string. Otherwise, return `'Invalid ZIP Code.'`. (ZIP Codes do not begin with 00 in the mainland U.S.)
    *  If `zipcode` has four characters and the first character is NOT `'0'`, the function must add a zero to the beginning of the string and return the five-character `zipcode` as a string.
    *  If `zipcode` has four characters and the first character is `'0'`, the function must return `'Invalid ZIP Code.'`.

*Example:*

```
 [IN] zip_checker('02806')
[OUT] '02806'

 [IN] zip_checker('2806')
[OUT] '02806'

 [IN] zip_checker('0280')
[OUT] 'Invalid ZIP Code.'

 [IN] zip_checker('00280')
[OUT] 'Invalid ZIP Code.'
```

**Note that there is more than one way to solve this problem.**

In [2]:
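### YOUR CODE HERE ###
def zip_checker(zipcode):
    # Five characters: valid unless it begins with '00'
    if len(zipcode) == 5:
        if zipcode[0:2] == '00':
            return 'Invalid ZIP Code.'
        else:
            return zipcode
    # Four characters: restore the missing leading zero
    elif zipcode[0] != '0':
        zipcode = '0' + zipcode
        return zipcode
    # Four characters already starting with '0'
    else:
        return 'Invalid ZIP Code.'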

In [3]:
zip_checker('02806')

'02806'

In [4]:
zip_checker('2806')

'02806'

In [5]:
zip_checker('0280')

'Invalid ZIP Code.'

In [6]:
zip_checker('00280')

'Invalid ZIP Code.'
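
Since there is more than one way to solve this problem, here is one possible alternative built on the built-in `str.zfill()` method. The function name `zip_checker_zfill` is hypothetical, and rejecting inputs that are not four or five characters long is an assumption that goes beyond what the task specifies.

```python
def zip_checker_zfill(zipcode):
    """Hypothetical variant of zip_checker built on str.zfill()."""
    # Assumption: anything other than four or five characters is invalid.
    if len(zipcode) not in (4, 5):
        return 'Invalid ZIP Code.'
    # A four-character input that already starts with '0' is invalid.
    if len(zipcode) == 4 and zipcode.startswith('0'):
        return 'Invalid ZIP Code.'
    # zfill(5) pads on the left with zeros up to five characters and
    # leaves five-character strings unchanged.
    padded = zipcode.zfill(5)
    if padded.startswith('00'):
        return 'Invalid ZIP Code.'
    return padded


print(zip_checker_zfill('02806'))  # 02806
print(zip_checker_zfill('2806'))   # 02806
print(zip_checker_zfill('0280'))   # Invalid ZIP Code.
print(zip_checker_zfill('00280'))  # Invalid ZIP Code.
```

The explicit length check also means an empty string returns the invalid message rather than raising an `IndexError` on `zipcode[0]`.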

## Task 3: Extract the store ID

Now imagine that you've been provided `url`, which is a URL containing the store's actual ID at the end of it.

1.  Extract the seven-character store ID from the end of `url` and assign the result to a variable called `id`.
2.  Print the contents of `id`.

In [7]:
url = "https://exampleURL1.com/r626c36"

# 1. ### YOUR CODE HERE ###
id = url[-7:]

# 2. ### YOUR CODE HERE ###
print(id)

r626c36
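
Slicing with `url[-7:]` works because the ID is known to be seven characters long. As a sketch of an alternative that does not depend on the length, the last path segment can be taken with `str.rsplit()`. The variable names `store_id_from_split` and `short_url` are illustrative; the shorter URL is borrowed from the Task 4 examples.

```python
url = "https://exampleURL1.com/r626c36"

# rsplit with maxsplit=1 splits once, at the last '/'.
store_id_from_split = url.rsplit('/', 1)[-1]
print(store_id_from_split)  # r626c36

# With a six-character ID, fixed-width slicing picks up the slash,
# while splitting on '/' still isolates the ID.
short_url = "https://exampleURL1.com/r626c3"
print(short_url[-7:])                 # /r626c3
print(short_url.rsplit('/', 1)[-1])   # r626c3
```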


## Task 4: String extraction function

You have many URLs that contain store IDs, but many of them are invalid, either because they use an invalid protocol (the beginning of the URL) or because the store ID is not seven characters long.

*  The correct URL protocol is `https:`. Anything else is invalid.
*  A valid store ID must have exactly seven characters.



Define a function called `url_checker` that accepts the following argument:
*  `url` - a URL string

Return:
*  If both the protocol and the store ID are invalid:
    * print two lines: <br/>
    `'{protocol} is an invalid protocol.'` <br/>
    `'{store_id} is an invalid store ID.'` <br/>
*  If only the protocol is invalid:
    * print: <br/>
    `'{protocol} is an invalid protocol.'` <br/>
*  If only the store ID is invalid:
    * print: <br/>
        `'{store_id} is an invalid store ID.'` <br/>
*  If both the protocol and the store ID are valid, return the store ID.

In the above cases, `{protocol}` is a string of the protocol and `{store_id}` is a string of the store ID.

*Example:*

```
 [IN] url_checker('http://exampleURL1.com/r626c3')
[OUT] 'http: is an invalid protocol.'
      'r626c3 is an invalid store ID.'

 [IN] url_checker('ftps://exampleURL1.com/r626c36')
[OUT] 'ftps: is an invalid protocol.'

 [IN] url_checker('https://exampleURL1.com/r626c3')
[OUT] 'r626c3 is an invalid store ID.'

 [IN] url_checker('https://exampleURL1.com/r626c36')
[OUT] 'r626c36'
```

**Note that there is more than one way to solve this problem.**

In [8]:
# Sample valid URL for reference while writing your function:
url = 'https://exampleURL1.com/r626c36'

### YOUR CODE HERE ###
def url_checker(url):
    url = url.split('/')
    protocol = url[0]
    store_id = url[-1]
    # If both protocol and store_id bad, print one message per line
    if protocol != 'https:' and len(store_id) != 7:
        print(f'{protocol} is an invalid protocol.',
              f'{store_id} is an invalid store ID.', sep='\n')
    # If just protocol bad
    elif protocol != 'https:':
        print(f'{protocol} is an invalid protocol.')
    # If just store_id bad
    elif len(store_id) != 7:
        print(f'{store_id} is an invalid store ID.')
    # If all ok
    else:
        return store_id

In [9]:
url_checker('http://exampleURL1.com/r626c3')

http: is an invalid protocol.
r626c3 is an invalid store ID.


In [10]:
url_checker('ftps://exampleURL1.com/r626c36')

ftps: is an invalid protocol.


In [11]:
url_checker('https://exampleURL1.com/r626c3')

r626c3 is an invalid store ID.


In [12]:
url_checker('https://exampleURL1.com/r626c36')

'r626c36'
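
As another possible approach, the standard library's `urllib.parse.urlsplit` can separate the protocol from the path instead of splitting on `'/'` by hand. This is only a sketch: the function name `url_checker_parsed` is hypothetical, and it assumes the store ID is always the last path segment. Note that `urlsplit` reports the scheme without the trailing colon, so the colon is added back to match the lab's message format.

```python
from urllib.parse import urlsplit


def url_checker_parsed(url):
    """Hypothetical variant of url_checker built on urllib.parse.urlsplit."""
    parts = urlsplit(url)
    # urlsplit returns 'https' rather than 'https:', so restore the colon.
    protocol = parts.scheme + ':'
    # Assumption: the store ID is the last segment of the path.
    store_id = parts.path.rsplit('/', 1)[-1]

    # Check each condition once and collect the messages.
    problems = []
    if protocol != 'https:':
        problems.append(f'{protocol} is an invalid protocol.')
    if len(store_id) != 7:
        problems.append(f'{store_id} is an invalid store ID.')

    if problems:
        print('\n'.join(problems))
        return None
    return store_id


print(url_checker_parsed('https://exampleURL1.com/r626c36'))  # r626c36
```

Collecting the messages in a list means each condition is tested once, so the "both invalid" case does not need its own branch.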

## Conclusions

* Strings have been instrumental in storing important data, such as unique identifiers.
* I have used string concatenation to help me combine data that is stored in different strings.
* I have found string formatting to be useful when inserting specific values into reusable string templates.
* Functions have boosted efficiency by allowing me to reuse code for repeated tasks.
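
To illustrate the point about reusable string templates, here is a minimal sketch using `str.format()`. The message wording follows Task 4; the names `INVALID_TEMPLATE` and `invalid_message` are hypothetical.

```python
# One template serves both kinds of error message from Task 4.
INVALID_TEMPLATE = '{value} is an invalid {kind}.'


def invalid_message(value, kind):
    # Fill the named placeholders in the template.
    return INVALID_TEMPLATE.format(value=value, kind=kind)


print(invalid_message('http:', 'protocol'))   # http: is an invalid protocol.
print(invalid_message('r626c3', 'store ID'))  # r626c3 is an invalid store ID.
```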

