# General Image Encoder

## Summary

The idea is to turn a non-image problem into an image problem: we encode each example's features as pixel channels, then train an image classifier or regressor on the generated images.

## Part 0: Getting Data (handled by user)

### Summary

Obtain tabular data. The user is responsible for ensuring that all data is clean.

## Part 1: Feature Engineering

In this part, we start from some tabular data (say, in a dataframe). Since this step will be automated, we brute-force create a large number of features in the hope that they describe the data better than the raw input. We do this using the open-source Python module ```featuretools```.
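
As a rough illustration of what brute-force feature generation means, the sketch below builds pairwise sums and products of the numeric columns with plain pandas. It is a stand-in and does not use the ```featuretools``` API; the function name ```brute_force_features``` and the choice of transforms are assumptions made here for illustration only.

```python
from itertools import combinations

import pandas as pd


def brute_force_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return df plus pairwise sums and products of its numeric columns.

    Hypothetical helper for illustration; featuretools automates this kind
    of generation with its own primitives.
    """
    numeric = df.select_dtypes(include="number")
    new_columns = {}
    for a, b in combinations(numeric.columns, 2):
        new_columns[f"{a}+{b}"] = numeric[a] + numeric[b]
        new_columns[f"{a}*{b}"] = numeric[a] * numeric[b]
    # Keep the raw columns and append the synthetic ones.
    return pd.concat([df, pd.DataFrame(new_columns, index=df.index)], axis=1)


raw = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]})
features = brute_force_features(raw)
print(features.shape)  # 3 raw columns + 2 transforms * 3 pairs = 9 columns
```

The number of generated columns grows quadratically with the number of raw columns, which is why the later steps need a way to prune or pad the feature set.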

## Part 2: Clustering Features and Assigning to Channels

### Steps
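1. Ensure that the number of features, $m$, equals $3\cdot4^n$ for some non-negative integer $n$, so that the recursive four-way split in step 3 ends with exactly three features (one per channel) at each pixel of a $2^n \times 2^n$ image. If it does not, go back to the __Feature Engineering__ step and create or remove features until this condition holds.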
2. Let the data table with the features (in our case, a ```pandas.DataFrame```) be called ```features```. We convert it to a numpy array, $X$, in which each column holds the values of one feature and each row is one entry/example. In traditional cluster analysis, examples are clustered together by minimizing a distance metric computed from the differences between the examples' feature values. Here, however, we want to group features by their values across examples: two synthetic features are similar when they take similar values on the same examples. Hence, to cluster the features, we perform cluster analysis on $X^T$, whose rows are features and whose columns are examples. When $X$ contains a large number of examples, $X^T$ has a large number of columns, so cluster analysis on $X^T$ will suffer from the curse of dimensionality. Dimensionality reduction can help by reducing the number of columns (i.e. the number of example dimensions used). One option is to apply PCA to $X^T$ and keep only the leading principal components.
3. Run the function ```fn()``` below.

```python
from BalancedClusters import BalancedClusters

# NOTE: num_examples, get_pixel_to_populate, populate_pixel, make_clusters and
# get_location_descriptors are helper functions that are not defined in this document.


def fn(x_transpose, location_descriptor):
    if num_examples(x_transpose) == 3:
        # populate the three channels at the specified location in the picture
        pixel_location = get_pixel_to_populate(location_descriptor)
        populate_pixel(pixel_location)
    else:
        # get balanced clusters for each quadrant
        # make_clusters must return a dictionary from name to 2D numpy array
        clusters = make_clusters(data=x_transpose, num_clusters=4)
        max_iterations = 1000
        cluster_balancer = BalancedClusters(clusters, 'optimal')
        balanced_clusters = cluster_balancer.balance_clusters(max_iterations=max_iterations, verbose='none')

        # get quadrant locations
        quadrant_locations = get_location_descriptors(location_descriptor)

        assert len(quadrant_locations) == len(balanced_clusters.keys()) == 4
        # call fn recursively on each quadrant
        for i, key in enumerate(balanced_clusters.keys()):
            fn(balanced_clusters[key], quadrant_locations[i])
```
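
To make the recursion concrete, here is a minimal, self-contained sketch of the same layout idea. It rests on several assumptions that are not part of the design above: the feature count is exactly $3\cdot4^n$, a location descriptor is simply a (row, column, size) square, and the clustering plus balancing step is replaced by a crude stand-in (sort the features by mean value and cut the ordering into four equal groups). All function names here are hypothetical, and nothing below uses the ```BalancedClusters``` API.

```python
import numpy as np


def image_side(num_features: int) -> int:
    """Return 2**n if num_features == 3 * 4**n, else raise ValueError."""
    if num_features < 3 or num_features % 3 != 0:
        raise ValueError("feature count must be 3 * 4**n")
    cells, side = num_features // 3, 1
    while cells > 1:
        if cells % 4 != 0:
            raise ValueError("feature count must be 3 * 4**n")
        cells //= 4
        side *= 2
    return side


def split_into_four(x_transpose: np.ndarray) -> list:
    """Stand-in for clustering + balancing: sort features (rows) by their
    mean value and cut the ordering into four equal-sized groups."""
    order = np.argsort(x_transpose.mean(axis=1), kind="stable")
    return [x_transpose[idx] for idx in np.split(order, 4)]


def fill(x_transpose: np.ndarray, images: np.ndarray, row: int, col: int, size: int) -> None:
    """Recursively place features into the square region whose top-left corner is (row, col)."""
    if x_transpose.shape[0] == 3:
        # Base case: three features become the three channels of one pixel.
        images[:, row, col, :] = x_transpose.T
        return
    half = size // 2
    corners = [(row, col), (row, col + half), (row + half, col), (row + half, col + half)]
    for group, (r, c) in zip(split_into_four(x_transpose), corners):
        fill(group, images, r, c, half)


def encode(X: np.ndarray) -> np.ndarray:
    """Encode an (examples, features) array as (examples, side, side, 3) images."""
    side = image_side(X.shape[1])
    images = np.zeros((X.shape[0], side, side, 3), dtype=X.dtype)
    fill(X.T, images, 0, 0, side)
    return images


X = np.random.default_rng(0).normal(size=(5, 48))  # 48 = 3 * 4**2 features
images = encode(X)
print(images.shape)  # (5, 4, 4, 3)
```

Each feature lands in exactly one channel of one pixel, and features that fall in the same group at every level of the recursion end up spatially close together, which is the property an image model can exploit.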

### Idea for feature selection/dimensionality reduction (in general)

- Transpose of regular feature array ($X \to X^T$)
- K clusters
- See what examples of $X^T$ are in what groups (i.e. what columns of $X$ are most similar)
- Perform some sort of averaging for each group
- Proceed with only K features
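
The steps above can be sketched as follows. This is a minimal illustration under assumptions: the tiny Lloyd's k-means written here stands in for whatever clustering routine is eventually chosen, plain column means are used as the "averaging", and the names ```kmeans_labels``` and ```reduce_features``` are hypothetical.

```python
import numpy as np


def kmeans_labels(points: np.ndarray, k: int, iterations: int = 50, seed: int = 0) -> np.ndarray:
    """Very small Lloyd's k-means; returns a cluster label for each row."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random (fancy indexing copies them).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Assign every point to its nearest center (squared Euclidean distance).
        distances = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def reduce_features(X: np.ndarray, k: int) -> np.ndarray:
    """Cluster the columns of X into k groups and average each group."""
    labels = kmeans_labels(X.T, k)  # rows of X.T are the features of X
    groups = [X[:, labels == j] for j in range(k)]
    # Empty groups are skipped, so the result can have fewer than k columns.
    return np.column_stack([g.mean(axis=1) for g in groups if g.shape[1] > 0])


rng = np.random.default_rng(1)
base = rng.normal(size=(100, 3))
# Twelve features made of three underlying signals plus a little noise.
X = np.repeat(base, 4, axis=1) + 0.01 * rng.normal(size=(100, 12))
reduced = reduce_features(X, k=3)
print(reduced.shape)  # 100 rows and at most k columns
```

Averaging is only one choice for summarizing a group; picking a single representative column per group would keep the reduced features interpretable as original features.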

## Part 3: Wrapping into an API

## Part 4: Make Predictions! (handled by user)