# EuroMillions

EuroMillions is a transnational lottery in which a player must match all seven drawn numbers to win the jackpot. It was launched on February 7, 2004.
Draws are held every Tuesday and Friday night at 20:45 CET in Paris. A standard EuroMillions ticket costs €2.50, £2.50 or CHF3.50 per line played, depending on the local currency.

This notebook presents simple algorithms for choosing a combination of numbers, based on how often each number has been drawn in the past.

The player must select:
- __5__ different main numbers from 1 to 50
- __2__ different lucky star numbers from 1 to 12
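
To put the selection rules above into numbers, the short sketch below counts how many distinct tickets exist. It assumes only the rules just listed (5 of 50 main numbers, 2 of 12 lucky stars, order irrelevant); the variable names are illustrative.

```python
import math

# Ways to choose 5 main numbers out of 50 (order does not matter)
main_combinations = math.comb(50, 5)

# Ways to choose 2 lucky stars out of 12
star_combinations = math.comb(12, 2)

# Every main-number choice can be paired with every lucky-star choice
total_combinations = main_combinations * star_combinations

print(main_combinations)   # 2118760
print(star_combinations)   # 66
print(total_combinations)  # 139838160
```

A single line therefore has a 1 in 139,838,160 chance of matching all seven numbers.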

## Setup

In [1]:
%matplotlib inline

import numpy as np # linear algebra
import pandas as pd # data processing
import matplotlib.pyplot as plt # data visualization
import seaborn as sns # data visualization
from sklearn.preprocessing import LabelEncoder
import math
#from mpl_toolkits.mplot3d import Axes3D

# Models
#from sklearn.model_selection import train_test_split
#from sklearn import metrics
#from sklearn.metrics import mean_absolute_error
#from sklearn import decomposition
#from sklearn import datasets

In [9]:
df = pd.read_csv('euro.csv', low_memory = False)

In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1234 entries, 0 to 1233
Data columns (total 14 columns):
No.          1234 non-null object
 Day         1233 non-null object
DD           1233 non-null object
MMM          1233 non-null object
YYYY         1233 non-null object
 N1          1233 non-null object
N2           1233 non-null object
N3           1233 non-null object
N4           1233 non-null object
N5           1233 non-null object
L1           1233 non-null object
L2           1233 non-null object
  Jackpot    1233 non-null object
   Wins      1233 non-null object
dtypes: object(14)
memory usage: 135.0+ KB
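
The `df.info()` output above shows two things worth noting: several column names carry leading spaces (`' Day'`, `' N1'`, `'  Jackpot'`, `'   Wins'`), and one row has a value only in `No.`. The sketch below shows one possible way to tidy such a frame. It is not part of the original notebook: the inline sample is a small stand-in for `euro.csv`, and the content of its last row (`Total`) is a hypothetical placeholder, since the real trailing row is not shown.

```python
import io
import pandas as pd

# Small stand-in for euro.csv: padded header names and a trailing row
# that has a value only in 'No.' (the 'Total' label is hypothetical)
raw = io.StringIO(
    "No., Day,DD,MMM,YYYY, N1,N2\n"
    "1232,Fri,12,Jul,2019,2,31\n"
    "1231,Tue,9,Jul,2019,32,7\n"
    "Total,,,,,,\n"
)
sample = pd.read_csv(raw, dtype=str)  # read everything as text, like the object columns above

clean = sample.copy()
clean.columns = clean.columns.str.strip()      # ' N1' -> 'N1', ' Day' -> 'Day'
clean = clean.dropna(subset=['N1']).copy()     # drop the row that has no draw data

# Convert the numeric columns from text to numbers
for col in ['No.', 'DD', 'YYYY', 'N1', 'N2']:
    clean[col] = pd.to_numeric(clean[col])

print(clean.dtypes)
```

The rest of this notebook keeps the original column names and text values, which is why it refers to `' N1'` with a leading space and compares `YYYY` against the string `'2019'`.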


In [12]:
df.head()

Unnamed: 0,No.,Day,DD,MMM,YYYY,N1,N2,N3,N4,N5,L1,L2,Jackpot,Wins
0,1232,Fri,12,Jul,2019,2,31,39,45,47,8,4,81180289,0
1,1231,Tue,9,Jul,2019,32,7,41,36,29,7,6,73547916,0
2,1230,Fri,5,Jul,2019,20,2,42,34,9,9,6,67458174,0
3,1229,Tue,2,Jul,2019,19,44,11,29,45,12,8,60173279,0
4,1228,Fri,28,Jun,2019,1,33,3,49,16,2,11,51583856,0


In [4]:
#DEBUG (commented out; the value below is the output of an earlier run of this cell)
#df[' N1'].value_counts()
#df[' N1'].value_counts().max()

35

In [17]:
# Filter by year (YYYY is stored as text, so compare against the string '2019')
df = df.loc[df['YYYY'] == '2019']

In [18]:
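# Count how often each number was drawn in position 1 (the column name has a leading space)
P1 = df[['No.',' N1']]
R1 = P1.groupby(' N1').count()
P1_MAX = R1['No.'].max()  # highest count for this position

# Keep only the number(s) that reach the highest count
R1 = R1.loc[R1['No.'] == P1_MAX].copy()
R1['#'] = 'P1'
R1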

N1,No.,#
19,5,P1


In [19]:
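# Count how often each number was drawn in position 2
P2 = df[['No.','N2']]
R2 = P2.groupby('N2').count()
P2_MAX = R2['No.'].max()  # highest count for this position

# Keep only the number(s) that reach the highest count
R2 = R2.loc[R2['No.'] == P2_MAX].copy()
R2['#'] = 'P2'
R2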

N2,No.,#
16,4,P2


In [20]:
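# Count how often each number was drawn in position 3
P3 = df[['No.','N3']]
R3 = P3.groupby('N3').count()
P3_MAX = R3['No.'].max()  # highest count for this position

# Keep only the number(s) that reach the highest count
R3 = R3.loc[R3['No.'] == P3_MAX].copy()
R3['#'] = 'P3'
R3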

N3,No.,#
26,4,P3


In [21]:
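# Count how often each number was drawn in position 4
P4 = df[['No.','N4']]
R4 = P4.groupby('N4').count()
P4_MAX = R4['No.'].max()  # highest count for this position

# Keep only the number(s) that reach the highest count (ties are all kept)
R4 = R4.loc[R4['No.'] == P4_MAX].copy()
R4['#'] = 'P4'
R4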

N4,No.,#
25,4,P4
39,4,P4


In [22]:
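# Count how often each number was drawn in position 5
P5 = df[['No.','N5']]
R5 = P5.groupby('N5').count()
P5_MAX = R5['No.'].max()  # highest count for this position

# Keep only the number(s) that reach the highest count (ties are all kept)
R5 = R5.loc[R5['No.'] == P5_MAX].copy()
R5['#'] = 'P5'
R5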

N5,No.,#
1,3,P5
6,3,P5
29,3,P5
39,3,P5


In [23]:
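# Count how often each number was drawn as the first lucky star
L1 = df[['No.','L1']]
LR1 = L1.groupby('L1').count()
L1_MAX = LR1['No.'].max()  # highest count for this lucky star

# Keep only the number(s) that reach the highest count
LR1 = LR1.loc[LR1['No.'] == L1_MAX].copy()
LR1['#'] = 'L1'
LR1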

L1,No.,#
2,11,L1


In [24]:
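# Count how often each number was drawn as the second lucky star
L2 = df[['No.','L2']]
LR2 = L2.groupby('L2').count()
L2_MAX = LR2['No.'].max()  # highest count for this lucky star

# Keep only the number(s) that reach the highest count
LR2 = LR2.loc[LR2['No.'] == L2_MAX].copy()
LR2['#'] = 'L2'
LR2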

L2,No.,#
6,9,L2
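
The seven cells above repeat the same three steps: count the values in one column, find the highest count, and keep every value that reaches it. As a possible refactoring, the sketch below wraps those steps in a single function. The name `most_frequent`, the `positions` mapping and the sample frame are illustrative and not part of the original notebook; the sample uses made-up integers only to demonstrate the call.

```python
import pandas as pd

def most_frequent(draws, column, label):
    """Return the value(s) drawn most often in `column`, tagged with `label`."""
    counts = draws[column].value_counts()
    # Keep every value tied for the highest count, in ascending order
    top = counts[counts == counts.max()].sort_index()
    return pd.DataFrame({'Count': top.values, '#': label, 'Number': top.index})

# Made-up sample; the column names follow the df.info() listing (note ' N1')
sample = pd.DataFrame({
    ' N1': [19, 19, 2, 7],
    'N2':  [16, 4, 16, 4],
})

# Map each position label to its column and stack the per-position results
positions = {'P1': ' N1', 'P2': 'N2'}
result_sketch = pd.concat(
    [most_frequent(sample, col, label) for label, col in positions.items()],
    ignore_index=True,
)
print(result_sketch)
```

With the real data, `positions` would list all seven columns (`' N1'` to `'N5'`, `'L1'`, `'L2'`), and the function already returns the `Count`, `#` and `Number` columns that the later cells build by hand.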


In [25]:
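# Copy each index (the drawn number) into a regular column.
# The same column name 'N1' is used for every frame so that they line up when concatenated.
R1['N1'] = R1.index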
R2['N1'] = R2.index
R3['N1'] = R3.index
R4['N1'] = R4.index
R5['N1'] = R5.index
LR1['N1'] = LR1.index
LR2['N1'] = LR2.index

In [26]:
frames = [R1, R2, R3, R4, R5, LR1, LR2]

result = pd.concat(frames)
result

Unnamed: 0,No.,#,N1
19,5,P1,19
16,4,P2,16
26,4,P3,26
25,4,P4,25
39,4,P4,39
1,3,P5,1
6,3,P5,6
29,3,P5,29
39,3,P5,39
2,11,L1,2
6,9,L2,6


In [27]:
result.rename(columns={'No.':'Count','N1':'Number'}, inplace=True)
result = result.reset_index(drop=True)
result

Unnamed: 0,Count,#,Number
0,5,P1,19
1,4,P2,16
2,4,P3,26
3,4,P4,25
4,4,P4,39
5,3,P5,1
6,3,P5,6
7,3,P5,29
8,3,P5,39
9,11,L1,2
10,9,L2,6
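
The summary table still contains ties (two candidates for P4, four for P5) and the number 39 appears under both P4 and P5, so it does not yet describe a single ticket. The sketch below shows one possible way to reduce it to five main numbers and two lucky stars. The function `pick_ticket` and its tie-break rule (take the first candidate not already on the ticket) are assumptions made for illustration; the table is retyped by hand from the per-position outputs above, including the L2 result.

```python
import pandas as pd

# Most frequent numbers per position for 2019, retyped from the outputs above
result = pd.DataFrame({
    'Count':  [5, 4, 4, 4, 4, 3, 3, 3, 3, 11, 9],
    '#':      ['P1', 'P2', 'P3', 'P4', 'P4', 'P5', 'P5', 'P5', 'P5', 'L1', 'L2'],
    'Number': [19, 16, 26, 25, 39, 1, 6, 29, 39, 2, 6],
})

def pick_ticket(table):
    """Pick one number per position, skipping numbers already chosen."""
    main, stars = [], []
    for label, group in table.groupby('#', sort=False):
        # Labels starting with 'L' are lucky stars, all others are main numbers
        chosen = stars if label.startswith('L') else main
        # Assumed tie-break: first candidate that is not on the ticket yet
        for number in group['Number']:
            if number not in chosen:
                chosen.append(int(number))
                break
    return sorted(main), sorted(stars)

main_numbers, lucky_stars = pick_ticket(result)
print(main_numbers)  # [1, 16, 19, 25, 26]
print(lucky_stars)   # [2, 6]
```

If every candidate for a position is already on the ticket, this sketch leaves that position empty, so a fuller version would need a fallback such as the next most frequent number.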
