---
# Description

An exploratory data analysis of PCSO (Philippine Charity Sweepstakes Office) lotto draw results.

---


# Objective

This project deals only with **exploratory analysis**. The main objective is to understand how the data is distributed and to generate insights for future reference. This particular analysis focuses on the Ultra Lotto 6/58.

<i>Disclaimer: Lotteries are designed to be random, so while EDA can reveal patterns in historical data, those patterns do not help predict future winning combinations. These analyses are mostly for curiosity and entertainment, and they should not be used as a basis for gambling.

Always gamble responsibly, and consider the odds and risks associated with playing the lottery.</i>

---

# Outline

*A. Data Preprocessing*

*B. Descriptive Analysis*
1. Frequency Distribution

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

pcso = pd.read_csv("~/Documents/data/pcso_data.csv",index_col=0)

In [6]:
pcso.head()

| | LOTTO GAME | COMBINATIONS | DRAW DATE | JACKPOT (PHP) | WINNERS |
|---|---|---|---|---|---|
| 0 | Superlotto 6/49 | 37-20-41-05-46-35 | 10/5/2023 | 109025555.2 | 0 |
| 1 | Lotto 6/42 | 25-24-26-35-21-42 | 10/5/2023 | 21249596.4 | 0 |
| 2 | 6D Lotto | 5-1-6-7-0-0 | 10/5/2023 | 872333.0 | 0 |
| 3 | 3D Lotto 2PM | 4-8-2 | 10/5/2023 | 4500.0 | 203 |
| 4 | 3D Lotto 5PM | 6-7-3 | 10/5/2023 | 4500.0 | 199 |


In [8]:
pcso.info()

<class 'pandas.core.frame.DataFrame'>
Index: 33024 entries, 0 to 33023
Data columns (total 5 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   LOTTO GAME     33024 non-null  object
 1   COMBINATIONS   33024 non-null  object
 2   DRAW DATE      33024 non-null  object
 3   JACKPOT (PHP)  33024 non-null  object
 4   WINNERS        33024 non-null  int64 
dtypes: int64(1), object(4)
memory usage: 1.5+ MB
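
The `info()` output above reports `DRAW DATE` and `JACKPOT (PHP)` as `object` columns, meaning they are stored as strings. As a minimal sketch of one way to convert them during preprocessing, the cell below works on a small made-up frame named `sample` (a hypothetical stand-in, not the real file). It assumes dates are written month/day/year and that jackpot strings use commas as thousands separators; check both assumptions against the actual CSV before applying this to `pcso`.

```python
import pandas as pd

# Made-up sample that mirrors the column names above (assumption: the raw
# jackpot strings contain commas as thousands separators).
sample = pd.DataFrame({
    "LOTTO GAME": ["Ultra Lotto 6/58", "Ultra Lotto 6/58"],
    "DRAW DATE": ["10/14/2018", "10/5/2018"],
    "JACKPOT (PHP)": ["1,180,622,508.00", "903,290,152.00"],
})

# Parse month/day/year strings into datetime64 values.
sample["DRAW DATE"] = pd.to_datetime(sample["DRAW DATE"], format="%m/%d/%Y")

# Remove the thousands separators, then convert the column to float.
sample["JACKPOT (PHP)"] = (
    sample["JACKPOT (PHP)"].str.replace(",", "", regex=False).astype(float)
)

print(sample.dtypes)
```

With proper dtypes, sorting by date or jackpot compares values rather than strings.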


In [11]:
nullVals = pcso.isnull().sum().sum()
print(nullVals)

0


> The total is 0, so the dataset contains no null values.

In [13]:
pcso.columns

Index(['LOTTO GAME', 'COMBINATIONS', 'DRAW DATE', 'JACKPOT (PHP)', 'WINNERS'], dtype='object')

In [14]:
uniqueVals = pcso["LOTTO GAME"].unique()
print(uniqueVals)

['Superlotto 6/49' 'Lotto 6/42' '6D Lotto' '3D Lotto 2PM' '3D Lotto 5PM'
 '3D Lotto 9PM' '2D Lotto 2PM' '2D Lotto 5PM' '2D Lotto 9PM'
 'Grand Lotto 6/55' 'Megalotto 6/45' '4D Lotto' 'Ultra Lotto 6/58'
 'Suertres Lotto 11:30AM' 'Suertres Lotto 12:30PM' 'Suertres Lotto 2PM'
 'EZ2 Lotto 2PM' 'EZ2 Lotto 11:30AM' 'EZ2 Lotto 12:30PM']


In [87]:
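# Copy the filtered rows so the assignment below does not write to a view of pcso
dfULotto = pcso[pcso["LOTTO GAME"] == "Ultra Lotto 6/58"].copy()
# Strip thousands separators and store the jackpot as float
dfULotto["JACKPOT (PHP)"] = dfULotto["JACKPOT (PHP)"].str.replace(",", "").astype(float)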
dfSortedULotto = dfULotto.sort_values(by="JACKPOT (PHP)", ascending=False)
dfSortedULotto

| | LOTTO GAME | COMBINATIONS | DRAW DATE | JACKPOT (PHP) | WINNERS |
|---|---|---|---|---|---|
| 14751 | Ultra Lotto 6/58 | 40-50-37-25-01-45 | 10/14/2018 | 1180622508.0 | 2 |
| 14769 | Ultra Lotto 6/58 | 28-14-54-50-17-27 | 10/12/2018 | 1112647388.0 | 0 |
| 14797 | Ultra Lotto 6/58 | 12-16-46-03-38-36 | 10/9/2018 | 1026264340.0 | 0 |
| 14814 | Ultra Lotto 6/58 | 45-21-02-30-07-10 | 10/7/2018 | 954503164.0 | 0 |
| 14832 | Ultra Lotto 6/58 | 01-30-27-36-49-12 | 10/5/2018 | 903290152.0 | 0 |
| ... | ... | ... | ... | ... | ... |
| 11451 | Ultra Lotto 6/58 | 45-43-02-47-13-58 | 10/20/2019 | 49500000.0 | 0 |
| 11468 | Ultra Lotto 6/58 | 55-37-11-21-18-45 | 10/18/2019 | 49500000.0 | 0 |
| 11495 | Ultra Lotto 6/58 | 13-23-09-58-38-14 | 10/15/2019 | 49500000.0 | 0 |
| 11514 | Ultra Lotto 6/58 | 13-31-17-22-35-36 | 10/13/2019 | 49500000.0 | 0 |
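
The outline lists a frequency distribution as the first descriptive step. As a rough sketch of how that could be computed (not necessarily the approach used later in this notebook), the cell below splits each combination string into individual numbers and counts them. The names `combos`, `numbers`, and `frequency` are illustrative, and the three combinations are copied from the top of the sorted table above; a full analysis would presumably pass `dfULotto["COMBINATIONS"]` instead.

```python
import pandas as pd

# Three combinations copied from the sorted output above, used as a stand-in
# for the full dfULotto["COMBINATIONS"] column.
combos = pd.Series([
    "40-50-37-25-01-45",
    "28-14-54-50-17-27",
    "12-16-46-03-38-36",
])

# Split each "NN-NN-..." string into a list, flatten to one number per row,
# and convert the pieces to integers.
numbers = combos.str.split("-").explode().astype(int)

# Count how often each number appears, most frequent first.
frequency = numbers.value_counts()

print(frequency.head())
```

In this tiny sample only 50 appears more than once. On the full draw history, the counts could be plotted as a bar chart to see how evenly the 58 numbers are spread.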
