<div style="background-color: #cfc ; padding: 20px; border-radius: 10px ; border: 2px solid green;">
<p><font size="+3"><b><center> qcware.qML </center></b></font>
</p>
<p>
<font size="+3">  <b> <center> The Quantum Machine Learning API</center></b></font>
</p>    
</div>

**qcware.qML** is Forge's API for Machine Learning applications. 

Our ambition is to deliver the first real Quantum Machine Learning applications by bridging the gap between highly impactful QML algorithms and NISQ-era quantum machines.


**Why QML?** 

First, because ML provides extremely impactful tools for many domains (Finance, Healthcare, Automotive, etc.), so it makes sense to investigate as thoroughly as possible how quantum computing can enhance these tools in terms of Efficiency, Accuracy, Explainability, Trustworthiness, Energy savings, etc. 

Second, because we know that fault-tolerant quantum computers can offer big advantages in ML, and we have contributed substantially to the growing body of work in this area. 

Third, because we do have concrete avenues for bringing QML towards the NISQ era: by reducing resource requirements of impactful QML algorithms; by proposing NISQ Quantum Deep Learning with provable guarantees; even by pushing the boundaries of NISQ technologies to new QML-specific hardware architectures for overcoming bottlenecks.

Our QML methodology is NOT to discard algorithmic intuition and blindly use current hardware. Instead, we both push the boundaries of NISQ hardware and reduce the algorithmic requirements, in order to reach the first real-world QML applications. 


**How do we do this?**

We first needed to solve a number of bottlenecks and put a brand new engine under the hood. How do we efficiently load classical data into quantum states? How do we quickly extract useful information from quantum solutions? How do we calculate the similarity between data points, users, stock histories, or patients on the fly? These are our proprietary "under the hood" developments that make our API stronger than ever and ready for a test drive. 

In the current release, we use our new powerful quantum tools for supervised and unsupervised learning and provide NISQ solutions for classification, regression and clustering.

Forge is aimed both at users who want to solve their use cases directly with our optimized quantum functionalities, without having to worry too much about the quantum tools underneath, and at users who want to start experimenting on their own by building small quantum circuits and functions. 

But the best way to do this is to do it together! Some of our more powerful proprietary quantum tools are best made available in a more "supervised" setting, where QC Ware quantum algorithm experts accompany users through Proof of Concept projects to understand, step by step, the power and limitations of these tools and how they can be optimally applied to each specific use case. 

In [None]:
from qcware.qml import fit_and_predict
# as always when using Forge, we must set our API key if running outside the hosted notebooks
import qcware
# qcware.config.set_api_key('QCWARE')
# if scikit-learn is not installed, install it; it should be installed on the Forge notebook servers
# (the PyPI package is named scikit-learn; it is imported as sklearn)
!pip install scikit-learn



<div style="background-color: #cfc ; padding: 10px; border-radius: 10px ; border: 1px solid green;">
<font size="+2"><b> 0. Under the hood </b></font>
</div>

So let's take a peek at what we built under the hood to move closer to these first QML applications.

<div style="background-color: #cfc ; padding: 8px; border-radius: 8px ; border: 0.5px solid green;">
<font size="+1"><b> 0.1 quasar </b></font>
</div> 

**quasar** is our internal quantum language. Users who have experimented with other quantum languages will find it intuitive and familiar. It is a simple circuit-writing tool that offers some big advantages: 

- Design your circuit once, run it on all available simulators and quantum hardware 
- Use powerful built-in functionalities specifically for Optimization, Chemistry and ML
- Multiple backends supported
- GPU powered simulators available


<div style="background-color: #cfc ; padding: 8px; border-radius: 8px ; border: 1px solid green;">
<font size="+1"><b> 0.2 Data loaders </b></font>
</div> 

One of the first obstacles to practical QML was efficiently loading classical data into quantum states that can be used for further computation.
Why do we need that? Because ML data IS, and will predominantly remain, classical (images, texts, preferences, stock market, internet data, ...), so if we are looking for impactful quantum applications, ignoring classical data is not an option!

So what do we mean by **efficiently** loading classical data onto quantum states? We certainly do not want to assume any exotic hardware technologies that are not, or will not be, available in the NISQ era. More importantly, we want to create amplitude encodings of our classical data points as fast as possible. Namely, we want to load classical data in the following sense:  

    - Read the classical data once! (time: O(n), for n-dimensional real-valued data)
    - Prepare quantum encodings of the data fast! (time: O(log n), circuit size (two-qubit gates): O(n))

And we deliver exactly that. We have designed data loaders that:

- loader(): Provide the **optimal** ways for loading classical data onto quantum states
- Can be deployed using current NISQ machines
- Can be readily used as subroutines to bring many QML algorithms closer to the NISQ era.  
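
To make "amplitude encoding" concrete: a normalized vector x is stored as the amplitudes of a quantum state, so that basis state i is measured with probability x_i^2. The sketch below is a purely classical NumPy illustration and is not the Forge loader. It assumes a unary encoding (one qubit per feature) built from a simple chain of two-level rotations, and the function names `unary_loader_angles` and `amplitudes_from_angles` are hypothetical. It computes the rotation angles for such a chain and checks that they reproduce the data point. Note that a plain chain like this has depth linear in n; it only illustrates the angle bookkeeping, not a logarithmic-depth construction.

```python
import numpy as np

def unary_loader_angles(x):
    """Rotation angles theta_0..theta_{n-2} for a chain-style unary loader.

    The angles satisfy x[i] = cos(theta_i) * prod_{j<i} sin(theta_j)
    and x[n-1] = prod_j sin(theta_j).
    Assumes x has unit norm and a non-negative last entry.
    """
    x = np.asarray(x, dtype=float)
    # tail[i] is the Euclidean norm of x[i:]
    tail = np.sqrt(np.cumsum((x ** 2)[::-1])[::-1])
    # cos(theta_i) = x[i] / tail[i]; use 1.0 where the tail is exactly zero
    ratios = np.divide(x[:-1], tail[:-1],
                       out=np.ones(len(x) - 1), where=tail[:-1] > 0)
    return np.arccos(np.clip(ratios, -1.0, 1.0))

def amplitudes_from_angles(theta):
    """Amplitudes produced by applying the chain of rotations in order."""
    sin_prefix = np.concatenate(([1.0], np.cumprod(np.sin(theta))))
    cos_terms = np.concatenate((np.cos(theta), [1.0]))
    return sin_prefix * cos_terms

# synthetic, non-negative data point, normalized to unit length
rng = np.random.default_rng(seed=0)
x = rng.random(8)
x = x / np.linalg.norm(x)

theta = unary_loader_angles(x)
amplitudes = amplitudes_from_angles(theta)

print("number of rotation angles:", len(theta))
print("amplitudes match data    :", np.allclose(amplitudes, x))
print("measurement probabilities sum to:", np.sum(amplitudes ** 2))
```

An n-dimensional point needs n - 1 angles in this sketch, and reading the data once is enough to compute all of them, which is the classical preprocessing cost referred to above.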

We are also in discussions with quantum hardware collaborators to design tailor-made hardware chips for even better performance and seamless integration. Stay tuned!

Our standard Forge users can take advantage of our data loaders through the qML functionalities that we offer for both supervised and unsupervised learning. For those who want to spend more time and effort understanding the inner workings of the loaders and our other quantum tools, we are happy to accompany you on this journey. 

**Example**

Here is how the loaders work at a very high level. Define your data point x as a 1D array. Normalize it. Call the loader to produce the circuit that creates the quantum amplitude encoding of x. As simple as that. You can inspect how many qubits, what depth and how many gates the loader uses. Here is some sample code of what we do under the hood. These loaders are used when you run your own QML applications in the sections below!

    # create a synthetic data point or upload your own data!

    import numpy as np
    x = np.random.rand(64)
    x = x/np.linalg.norm(x)

    # create a quasar circuit and call the loader function.
    # The loader function takes as input the classical data point
    # and mode 'parallel' or 'optimized' and it returns the quantum circuit.
    # (qio is our data-loading module; its import is not shown in this excerpt)

    parallel_loader = qio.loader(x, mode='parallel')

    # let's see what characteristics the circuit has
    print("Parallel loader characteristics")
    print("-------------------------------")
    print("number of qubits:",parallel_loader.nqubit)
    print("circuit depth   :",parallel_loader.ntime)
    print("number of gates :",parallel_loader.ngate,"\n")


    # let's do it for the 'optimized' loader as well

    optimized_loader = qio.loader(x,mode='optimized')

    print("Optimized loader characteristics")
    print("-------------------------------")
    print("number of qubits:",optimized_loader.nqubit)
    print("circuit depth   :",optimized_loader.ntime)
    print("number of gates :",optimized_loader.ngate,"\n")


Here are the characteristics of the loaders for 64-dimensional data:
    
    >>> Parallel loader characteristics
    >>> -------------------------------
    >>> number of qubits: 64
    >>> circuit depth   : 7
    >>> number of gates : 64 
    >>> 
    >>> Optimized loader characteristics
    >>> -------------------------------
    >>> number of qubits: 16
    >>> circuit depth   : 34
    >>> number of gates : 72

<div style="background-color: #cfc ; padding: 8px; border-radius: 8px ; border: 1px solid green;">
<font size="+1"><b> 0.3 Distance Metrics Estimation </b></font>
</div> 

The main reason for creating these data loaders is that they unlock a trove of important capabilities, the first of which is quickly estimating the similarity or distance between data points. Distance metrics are at the core of Similarity Learning, one of the fundamental branches of Supervised and Unsupervised Learning, which provides accurate, efficient, and explainable AI applications.

Our Distance Metrics estimation functionalities cleverly combine our loaders, and not much more, to create NISQ subroutines for Similarity Learning. More precisely, we have developed distance metrics subroutines that:

- distance_estimation(): Provide **optimal** Euclidean distance estimation of data points
- qdot(): Provide **optimal** Inner Product estimation between data points
- Can be deployed using current NISQ machines
- Can be readily used as subroutines to bring many QML algorithms closer to the NISQ era.


**Examples**

Here is how it works. Define your data points x and y as 1D arrays. Call the distance_estimation procedure to get an estimate of their squared Euclidean distance. The distance_estimation procedure constructs a quantum circuit, runs it on the designated backend (here the quasar simulator), collects the measurement results and outputs the estimate. Again, these circuits are used when you run your own QML applications in the sections below.

    # create two random data points or load your own data
    
    import numpy as np
    x = np.random.rand(16)
    y = np.random.rand(16)

    # let's estimate the distance and compare it to the real one
    
    print('real distance',np.linalg.norm(x-y)**2)
    print('distance est.',qutils.distance_estimation(x, y,loader_mode='parallel'))

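    # we can also see what characteristics the circuit has
    # (dist_est is assumed here to denote the circuit that distance_estimation
    #  builds internally; its construction is not shown in this excerpt)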

    print("\nDistance Estimation characteristics")
    print("-------------------------------")
    print("number of qubits:",dist_est.nqubit)
    print("circuit depth:",dist_est.ntime)
    print("number of gates",dist_est.ngate)


Here is a sample output and the characteristics of the quantum circuit for 16-dimensional data:
    
    >>> real distance 2.9635211340099668
    >>> distance est. 2.96147527209646
    >>>
    >>> Distance Estimation characteristics
    >>> -------------------------------
    >>> number of qubits: 17
    >>> circuit depth: 13
    >>> number of gates 35
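
The estimate in the sample output differs from the exact value in the third decimal place because it is computed from a finite number of measurement results. The sketch below illustrates that statistical behavior classically. It does not reproduce the circuit used by distance_estimation: as a stand-in it assumes a swap-test-style measurement, where outcome 0 occurs with probability (1 + c^2)/2 for the normalized inner product c, and it relies on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y>. The function name `sampled_squared_distance` and the `shots` parameter are hypothetical.

```python
import numpy as np

def sampled_squared_distance(x, y, shots=100_000, rng=None):
    """Estimate ||x - y||^2 from simulated measurement shots.

    Assumes non-negative data, so that the inner product is non-negative
    and can be recovered from its square.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm_x, norm_y = np.linalg.norm(x), np.linalg.norm(y)

    # normalized inner product between the two data points
    overlap = np.dot(x, y) / (norm_x * norm_y)

    # swap-test-style statistics: P(outcome 0) = (1 + overlap^2) / 2
    p_zero = 0.5 * (1.0 + overlap ** 2)
    p_zero_hat = rng.binomial(shots, p_zero) / shots

    # invert the relation to get an estimate of the overlap
    overlap_hat = np.sqrt(max(2.0 * p_zero_hat - 1.0, 0.0))

    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>
    return norm_x ** 2 + norm_y ** 2 - 2.0 * norm_x * norm_y * overlap_hat

rng = np.random.default_rng(seed=7)
x = rng.random(16)
y = rng.random(16)

exact = np.linalg.norm(x - y) ** 2
estimate = sampled_squared_distance(x, y, rng=rng)

print("real distance", exact)
print("distance est.", estimate)
```

Increasing `shots` tightens the estimate, at the cost of more circuit executions; the error shrinks roughly as one over the square root of the number of shots.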

We can also estimate the inner product between vectors or matrices in the same way. Let's see how.


    # create two random data matrices or load your own data
    import numpy as np
    x = np.random.rand(4,5)
    y = np.random.rand(5,3)

    # compute the dot product of the two matrices
    # and the quantum dot product estimation of the two matrices
    
    print(np.dot(x,y))
    print(qutils.qdot(x, y))

Here is a sample outcome:
    
    >>> [[1.68310796 0.95850939 1.23604967]
    >>>  [1.71744922 1.08271752 1.4903697 ]
    >>>  [1.01983769 0.58927693 0.74201257]
    >>>  [0.61285301 0.24302302 0.88084614]]
    >>> [[1.68568646 0.95377326 1.25345586]
    >>>  [1.71279657 1.07006217 1.4773043 ]
    >>>  [1.01892272 0.58610302 0.74277502]
    >>>  [0.61225585 0.24286683 0.88671553]]
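
To quantify how close the two matrices in this sample outcome are, one can compare them entry by entry. The snippet below is a small NumPy check that uses only the numbers printed above; the variable names are illustrative.

```python
import numpy as np

# exact product, as printed by np.dot(x, y) in the sample outcome
exact = np.array([
    [1.68310796, 0.95850939, 1.23604967],
    [1.71744922, 1.08271752, 1.4903697],
    [1.01983769, 0.58927693, 0.74201257],
    [0.61285301, 0.24302302, 0.88084614],
])

# quantum estimate, as printed by qutils.qdot(x, y) in the sample outcome
estimated = np.array([
    [1.68568646, 0.95377326, 1.25345586],
    [1.71279657, 1.07006217, 1.4773043],
    [1.01892272, 0.58610302, 0.74277502],
    [0.61225585, 0.24286683, 0.88671553],
])

# largest entry-wise deviation and overall relative error
max_abs_error = np.abs(estimated - exact).max()
relative_error = np.linalg.norm(estimated - exact) / np.linalg.norm(exact)

print("max absolute error      :", max_abs_error)
print("relative Frobenius error:", relative_error)
```

For this sample, every entry of the estimate is within 0.02 of the exact product.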

<div style="background-color: #cfc ; padding: 8px; border-radius: 8px ; border: 1px solid green;">
<font size="+1"><b> 0.4 Datasets and Visualization </b></font>
</div> 

There are plenty of ways to load datasets and visualize the results. Any method that works with scikit-learn will also work with our QML algorithms. Be careful: quantum simulations run out of memory quickly as the number of qubits grows!
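
As one example of a scikit-learn method for producing data, the sketch below uses `sklearn.datasets.make_blobs` to generate a small two-dimensional clustered dataset. The parameter values are arbitrary choices for illustration, and small sizes are used deliberately to keep simulations light.

```python
import numpy as np
from sklearn.datasets import make_blobs

# 32 points in 2 dimensions, split evenly across 4 clusters
X, y = make_blobs(
    n_samples=32,
    n_features=2,
    centers=4,
    cluster_std=0.05,       # standard deviation of each cluster
    center_box=(0.0, 1.0),  # cluster centers are drawn from the unit square
    random_state=0,         # fixed seed for reproducibility
)

print("data shape        :", X.shape)
print("points per cluster:", np.bincount(y))
```

`X` holds the data points and `y` the cluster index of each point, so the same arrays can serve as unlabeled data for clustering or as labeled data for classification.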

Here we just provide some simple code for generating synthetic data and plotting the results. Nothing extraordinary, but an easy way to start playing with our QML functionalities.

In [None]:
import numpy as np

def generate_data_clusters(n_clusters = 4, n_points = 8, dimension = 2, magnitude = 1, spread = 0.05, min_distance = 0.3, add_labels = False):
    """
    Generates data clusters with random centers and normally distributed points.
    
    Args:
     n_clusters (int): Number of clusters to create
     n_points (int or numpy.array): Number of points in each cluster. 
                             If int, same number of points in each cluster
     dimension (int): Number of features in data.
     magnitude: upper bound on each coordinate of the cluster centers
     spread: variance of the normal distribution around each center
     min_distance: minimum distance between cluster centers
                   (if too large for the given magnitude, the search for centers may not terminate)
     add_labels: True for supervised data, False for unsupervised data

    Returns:
     data, or (data, labels) if add_labels is True
    """
    if isinstance(n_points, (int, np.integer)):
        n_points = np.tile(n_points, n_clusters)

    if min_distance > 0:
        # rejection sampling: keep a candidate center only if it is
        # farther than min_distance from every center accepted so far
        means = []
        while len(means) < n_clusters:
            mean = np.random.random(dimension) * magnitude
            if all(np.linalg.norm(mean - m) > min_distance for m in means):
                means.append(mean)
    else:
        means = np.random.random((n_clusters, dimension)) * magnitude

    # draw the points of each cluster from an isotropic normal distribution
    clusters = []
    cov = np.identity(dimension) * spread
    for i in range(n_clusters):
        clusters.append(np.random.multivariate_normal(mean=means[i], cov=cov, size=n_points[i]))
    data = np.concatenate(clusters)

    if add_labels:
        # label of each point is the index of the cluster it was drawn from
        labels = [i for i in range(n_clusters) for _ in range(n_points[i])]
        return data, labels

    return data

In [None]:
import matplotlib
import matplotlib.pyplot as plt 
from matplotlib import cm
import numpy as np

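def plot(X, labels, model, y=None, T=None):
    """
    Plot quantum outcomes.
    
    Args: 
        X: training data
        labels: predicted labels (or predicted values, for regression) of the test data
        model: name of the model, used as the plot title; the regressor names
               'QNeighborsRegressor' and 'KNeighborsRegressor' select the regression plot
        y: labels of training data (needed for the regression plot)
        T: test data. If None, T=X
    """
    
    if model in ('QNeighborsRegressor', 'KNeighborsRegressor'):
        # regression: training data as points, prediction as a curve
        plt.scatter(X, y, color='darkorange', label='data')
        if T is None: T = X
        plt.plot(T, labels, color='navy', label='prediction')
        plt.axis('tight')
        plt.legend()
        plt.title(model)
        plt.tight_layout()
        plt.show()
        
    else:
        # classification / clustering: 2D points colored by label
        if np.shape(X)[1] != 2: raise ValueError('Only 2D data can be plotted')
        if T is None: T = X
        X_data, Y_data = np.hsplit(T, 2)
        # np.asarray lets labels be passed as a list or as an array
        plt.scatter(X_data, Y_data, c = np.asarray(labels).reshape(np.shape(T)[0], -1))
        plt.axis('tight')
        plt.title(model)
        plt.tight_layout()
        plt.show()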