# Track Pattern Recognition using Linear Approximation of a Track

## Introduction

Track pattern recognition is an early step in the reconstruction of data from a particle detector. It groups the hits recorded by the subdetectors into tracks. The reconstructed track parameters make it possible to estimate how much a particle is deflected in a magnetic field, and thus to reconstruct its charge and momentum. This information is used to reconstruct the decay vertex, to identify the mother particle, and for further particle identification.
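
For reference, the standard relation between the radius of curvature $R$ of a track in a uniform magnetic field $B$ and the transverse momentum $p_T$ of a particle with charge $ze$ is approximately:

$$
p_T\,[\mathrm{GeV}/c] \approx 0.3 \, |z| \, B\,[\mathrm{T}] \, R\,[\mathrm{m}]
$$

The direction of the curvature gives the sign of the charge. A high-momentum particle therefore has a large radius of curvature, and its track is close to a straight line across the detector.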

There is a wide variety of track pattern recognition methods. They differ in how they process the hits, what kinds of tracks they can recognize, and which requirements these tracks must satisfy. The specifics of an experiment and its detector geometry affect the tracking performance, so a track pattern recognition method should be adapted to them.

This notebook considers track pattern recognition for a 2D detector with circular geometry and a uniform magnetic field. The detector layout, with the hits and tracks of one event, is shown in the figure below. The challenge is to recognize the tracks of an event with the highest efficiency. It is assumed that each hit can belong to only one track.

<img src="pic/detector.png" /> <br>

## About this notebook

This notebook demonstrates how a linear approximation of a track can be used for track pattern recognition. It describes the input data, the track pattern recognition method, and the quality metrics, and shows how to use them.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt

import pandas
import numpy

import user_test_submission as submission

In [2]:
#!sudo pip install scikit-learn==0.18.1

# Input data

In [3]:
name = "public_train"
data = pandas.read_csv('datasets/'+name+'.csv', index_col=False)
#data = data[data['event_id'].values < 100]

data.head()

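| Unnamed: 0 | event_id | cluster_id | layer | iphi | x | y |
|---|---|---|---|---|---|---|
| 0 | 3 | 4 | 4 | 53253 | 53.90043 | -265.585662 |
| 1 | 3 | 1 | 5 | 37216 | -47.614439 | -402.191329 |
| 2 | 3 | 1 | 0 | 7181 | -4.253919 | -38.767308 |
| 3 | 3 | 3 | 2 | 7937 | 44.418132 | 148.499258 |
| 4 | 3 | 4 | 0 | 7657 | 7.5886 | -38.254583 |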


# Linear Approximation of a Track

This method is based on a linear approximation of a track. The method is very simple; see the method script for details.
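
The method script is not reproduced in this notebook. As a rough illustration only, the sketch below assumes that a track starting at the origin is approximated by a straight line, so that all of its hits share nearly the same azimuthal angle `phi = atan2(y, x)`, and that hits are grouped by cutting the sorted angles wherever the gap exceeds a threshold. The function name `cluster_by_angle`, the `max_gap` value, and the layer radii are hypothetical; the actual `Clusterer` may work differently.

```python
import numpy as np


def cluster_by_angle(x, y, max_gap=0.05):
    """Group hits whose azimuthal angles are close together.

    Hypothetical helper: assumes tracks are straight lines through the origin.
    Returns one integer label per hit.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.size == 0:
        return np.empty(0, dtype=int)

    phi = np.arctan2(y, x)              # azimuthal angle in [-pi, pi]
    order = np.argsort(phi)
    phi_sorted = phi[order]

    # Angular gap between each hit and the previous one in sorted order;
    # a new cluster starts wherever the gap exceeds the threshold.
    gaps = np.diff(phi_sorted)
    labels_sorted = np.concatenate(([0], np.cumsum(gaps > max_gap)))

    # Merge the first and last clusters if they are neighbours across
    # the +pi / -pi boundary.
    wrap_gap = phi_sorted[0] + 2 * np.pi - phi_sorted[-1]
    if labels_sorted[-1] > 0 and wrap_gap <= max_gap:
        labels_sorted[labels_sorted == labels_sorted[-1]] = 0

    # Put the labels back in the original hit order.
    labels = np.empty_like(labels_sorted)
    labels[order] = labels_sorted
    return labels


# Two synthetic straight tracks at 0.3 rad and 2.0 rad, five layers each
# (arbitrary radii, for illustration only).
radii = np.array([39.0, 85.0, 155.0, 213.0, 271.0])
angles = np.concatenate((np.full(5, 0.3), np.full(5, 2.0)))
r = np.tile(radii, 2)
labels = cluster_by_angle(r * np.cos(angles), r * np.sin(angles))
print(labels)
```

A straight-line approximation like this is only reasonable for tracks with little curvature; strongly curved tracks change their azimuthal angle from layer to layer and would be split into several clusters.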

## Data Preparation

In [4]:
X = data[[u'layer', u'iphi', u'x', u'y']].values
y = data[[u'event_id', u'cluster_id']].values

## Train/Test Split

In [5]:
from sklearn.model_selection import train_test_split

event_ids = numpy.unique(data['event_id'].values)

event_ids_train, event_ids_test = train_test_split(event_ids, 
                                                   test_size=1000, 
                                                   random_state=42)

# Boolean masks over hits, built from the event-level split.
train_mask = data['event_id'].isin(event_ids_train).values
test_mask = data['event_id'].isin(event_ids_test).values

X_train, y_train = X[train_mask], y[train_mask]
X_test, y_test = X[test_mask], y[test_mask]
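
The split above is made on event identifiers rather than on individual hits, so all hits of an event end up on the same side. A small self-contained check of that property on toy data is sketched below; the `event_id` values are invented for illustration, and a NumPy permutation stands in for `train_test_split`.

```python
import numpy as np

# Toy hit table: one event_id per hit (invented values, for illustration only).
event_id = np.array([3, 3, 3, 7, 7, 9, 9, 9, 9, 12])

# Shuffle the unique events and hold out one of them for testing.
rng = np.random.RandomState(42)
shuffled = rng.permutation(np.unique(event_id))
n_test_events = 1
test_events, train_events = shuffled[:n_test_events], shuffled[n_test_events:]

# Boolean masks over hits, built from the event-level split.
train_mask = np.isin(event_id, train_events)
test_mask = np.isin(event_id, test_events)

# Every hit is used exactly once, and no event is shared between the sets.
assert np.all(train_mask ^ test_mask)
assert not set(event_id[train_mask]) & set(event_id[test_mask])
```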

## Track Pattern Recognition

In [6]:
from clusterer import Clusterer
from sklearn.cluster import DBSCAN

ctr = Clusterer(cluster=DBSCAN(eps=0.05, min_samples=1))

ctr.fit(X_train, y_train)

In [7]:
%%time
from metrics import predictor

y_pred_test = predictor(ctr, X_test, y_test)

CPU times: user 2.22 s, sys: 11 ms, total: 2.23 s
Wall time: 2.25 s


## Score

In [8]:
score = submission.score_function(y_test, y_pred_test)
score

0.9423527736033346
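
The implementation of `submission.score_function` is not shown in this notebook. For intuition only, the sketch below computes one possible efficiency-like quantity for a single event: the fraction of hits that fall into a one-to-one matched pair of true track and predicted cluster. The function name `matched_hit_fraction` and the greedy matching are assumptions made for this illustration, and the result should not be expected to reproduce the score above.

```python
import numpy as np


def matched_hit_fraction(true_ids, pred_ids):
    """Fraction of hits in one event covered by one-to-one matched pairs.

    Hypothetical illustration, not the notebook's score_function.
    """
    true_ids = np.asarray(true_ids).ravel()
    pred_ids = np.asarray(pred_ids).ravel()
    if true_ids.size == 0:
        return 0.0

    # Contingency table: overlap[i, j] is the number of hits of true
    # track i that were assigned to predicted cluster j.
    _, t = np.unique(true_ids, return_inverse=True)
    _, p = np.unique(pred_ids, return_inverse=True)
    overlap = np.zeros((t.max() + 1, p.max() + 1), dtype=int)
    np.add.at(overlap, (t, p), 1)

    # Greedy one-to-one matching: repeatedly take the largest remaining
    # overlap, then retire that track and that cluster.
    matched = 0
    while overlap.max() > 0:
        i, j = np.unravel_index(np.argmax(overlap), overlap.shape)
        matched += int(overlap[i, j])
        overlap[i, :] = 0
        overlap[:, j] = 0
    return float(matched) / true_ids.size


# One hit of the second track is wrongly attached to the first cluster.
example = matched_hit_fraction([0, 0, 0, 1, 1, 1], [5, 5, 5, 5, 8, 8])
print(example)
```

Because each track and each cluster can be matched only once, both merging two tracks into one cluster and splitting one track across several clusters lower the value. The greedy matching is a simplification and is not guaranteed to find the best possible assignment.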