## Using DataCleaner

On your first day in a data-engineering job, you are tasked with processing several lists of financial data. Each list holds the closing prices of a particular stock for the past 5 days.

The problem with these lists, however, is that they are "dirty": they contain improper data types and missing data.

Fortunately, you've been given access to the proprietary, in-house `DataCleaner` class to help with this processing.

For questions Q1 through Q5, use the methods and attributes of `DataCleaner` to accomplish these tasks.

Read the docstring, and use the test code provided to figure out how to use this class.

In [3]:
class DataCleaner:
    """Class that cleans data from a list.

    Attributes
    ----------
    data : list
        List of numbers. Could potentially contain missing or invalid data.

    Methods
    -------
    is_clean():
        Function that returns True if data only contains numerics. False otherwise.
    clean_data():
        Function that creates a clean list of data by omitting missing data and
        converting numeric strings to floats.
    """
    def __init__(self, data):
        self.data = data
    
    def is_clean(self):
        """Check if each element of data is a float or int"""
        for d in self.data:
            if not isinstance(d, (int, float)):
                return False
        return True
    
    def clean_data(self):
        """Create a new list and append only ints & floats, & converted float strings"""
        new_data = []
        for d in self.data:
            if isinstance(d, str) and d.replace(".", "", 1).isdigit():
                new_data.append(float(d))
            elif isinstance(d, (int, float)):
                new_data.append(d)
            else:
                continue
        return new_data

# Check behavior here
x = DataCleaner([230.33, 240, 250, "250.45", None, None, "Inf"])
new_x = x.clean_data()
# Notice: what did we omit? what did we keep?
new_x

[230.33, 240, 250, 250.45]
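To see why `"250.45"` was converted while `None` and `"Inf"` were dropped, it can help to look at the string test inside `clean_data` on its own. The sketch below copies that condition into a hypothetical helper, `looks_numeric` (an illustration only, not part of `DataCleaner`), and tries it on a few sample values.

```python
def looks_numeric(value):
    """Hypothetical helper mirroring the string test used in clean_data."""
    # Remove at most one decimal point, then require that only digits remain.
    return isinstance(value, str) and value.replace(".", "", 1).isdigit()


# Sample values chosen for illustration; print which ones pass the test.
samples = ["250.45", "240", "Inf", "None", "-3.5", "1e3", "1.2.3", None]
for s in samples:
    print(repr(s), looks_numeric(s))
```

Under this test, strings with a sign, an exponent, or more than one decimal point are treated as invalid, so `clean_data` would omit them as well.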

## Q1 

You've been given 5 lists of possibly erroneous financial data. For each list, use the `is_clean` method to check whether the data is "dirty."

Print out the status of each list to the console.

In [None]:
GOOG = ["99.57", 98.71, 98.05, 98.30, "99.71"]
AAPL = [140.09, 140.42, 138.98, 138.34, 142.99]
COUR = [11.33, "11.36", 11.51, 11.73, None]
GME = ["None", None, 25.27, 25.36, 25.56]
AMC = [6.53, 6.35, 6.12, 5.85, 6.04]

# write solution here

## Q2

For each "dirty" list, remove the erroneous data using the `clean_data` method. Print the clean version of each list to your console.

In [None]:
GOOG = ["99.57", 98.71, 98.05, 98.30, "99.71"]
AAPL = [140.09, 140.42, 138.98, 138.34, 142.99]
COUR = [11.33, "11.36", 11.51, 11.73, None]
GME = ["None", None, 25.27, 25.36, 25.56]
AMC = [6.53, 6.35, 6.12, 5.85, 6.04]

# write solution here

## Q3 

Next, save your clean data lists into new variables. Check to see that this new data is clean using the `is_clean` method. 

Be sure to print out the status of this data to your console.

In [None]:
GOOG = ["99.57", 98.71, 98.05, 98.30, "99.71"]
AAPL = [140.09, 140.42, 138.98, 138.34, 142.99]
COUR = [11.33, "11.36", 11.51, 11.73, None]
GME = ["None", None, 25.27, 25.36, 25.56]
AMC = [6.53, 6.35, 6.12, 5.85, 6.04]

# write solution here

## Q4 

Compare the lengths of your clean data to the lengths of your dirty data using the `len` function. Be sure to print these differences to your console.

In [None]:
GOOG = ["99.57", 98.71, 98.05, 98.30, "99.71"]
AAPL = [140.09, 140.42, 138.98, 138.34, 142.99]
COUR = [11.33, "11.36", 11.51, 11.73, None]
GME = ["None", None, 25.27, 25.36, 25.56]
AMC = [6.53, 6.35, 6.12, 5.85, 6.04]

# write solution here

## Q5

What do you notice about the lengths of your new "cleaned" lists? In your opinion, is this change useful or harmful?

Describe answer here.

## Inheriting from a Class

After working with the `DataCleaner` class, you come up with an idea for adding more functionality. The senior engineer in charge of the codebase, however, is hesitant to let you change the original class in case its simplicity is needed again. Nonetheless, the engineer agrees to let you implement a subclass.

Your task is to implement a subclass that inherits from `DataCleaner`, called `DataProcessor`. 
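As a reminder of how subclassing works in Python, here is a minimal sketch using two hypothetical classes, `Greeter` and `PoliteGreeter`, which are unrelated to `DataCleaner` and exist only for illustration. The subclass inherits `__init__`, adds one new method, and overrides an existing one while still reusing the parent's version through `super()`.

```python
class Greeter:
    """Hypothetical parent class, used only to illustrate inheritance."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}"


class PoliteGreeter(Greeter):
    """Hypothetical subclass: inherits __init__, adds a method, overrides another."""

    def farewell(self):
        # New method that exists only on the subclass.
        return f"Goodbye, {self.name}"

    def greet(self):
        # Overridden method: reuse the parent's result via super(), then extend it.
        return super().greet() + ", pleased to meet you"


g = PoliteGreeter("Ada")
print(g.greet())
print(g.farewell())
```

Because the new behavior lives in the subclass, the parent class is left untouched and can still be used exactly as before.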

### Docstring

You will create a docstring that explains what this new class does. Since this class does not take in any new attributes, you will not include an Attributes section.

However, you will include a `Methods` section to describe the following new methods:

* get_mean
* clean_data

### Methods

Your new methods will do the following:

* get_mean()
    * This new method calculates the average of the list. Keep in mind that there is no guarantee
    the list will be clean when the average is calculated. That is, you cannot simply call `sum()` on it,
    since the list might contain erroneous data types.
* clean_data()
    * This overridden method works like the previous `clean_data()`. However, instead of skipping bad data, it replaces each bad value with the average of the list.

    Think about how you can incorporate `get_mean()` into this method.
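To make the warning about `sum()` concrete, the short sketch below calls it on a hypothetical dirty list (the values are made up for illustration). It only demonstrates the failure; how `get_mean()` should work around it is left to you.

```python
mixed = [230.33, 240, "250.45", None]  # hypothetical dirty list

try:
    total = sum(mixed)
except TypeError as err:
    # sum() adds elements left to right and fails on the first non-number.
    total = None
    print("sum() failed:", err)
```

Here the error is raised as soon as `sum()` reaches the string `"250.45"`, before it ever gets to `None`.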

In [None]:
# write class here



In [None]:
# test code: do not modify!

# Check behavior here
x = DataProcessor([230.33, 240, 250, "250.45", None, None, "Inf"])
new_x = x.clean_data()
# Notice how this is different from our previous output.
new_x