<div style="display:flex;justify-content:center">
    <img src="https://unram.ac.id/wp-content/uploads/2018/09/UNRAM-LOGO-FIX-STATUTA-.png" style="height: 100px" /> 
    <img src="https://s3.dualstack.us-east-2.amazonaws.com/pythondotorg-assets/media/community/logos/python-logo-only.png" style="height: 100px"/> 
</div>


<h1 align="center">Monte Carlo Modeling and Simulation for Increasing Motorcycle Equipment Sales Revenue</h1>

# Table of Contents
<ul style="list-style:none">
    <li>
        <a href="#data-awal">Displaying the initial data</a>
    </li>
    <li>
        <a href="#preprocessing">Preprocessing the data</a>
    </li>
    <li>Running the hypothesis test</li>
    <li>Running the normality test</li>
    <li>Running the homogeneity test</li>
    <li>Choosing an analysis method</li>
</ul>
<hr>

The first step is to import the libraries needed to process the data:
1. Pandas:
Pandas is a library for data manipulation and analysis. It provides an efficient and flexible data structure, the DataFrame, which makes tabular data easy to work with.
2. NumPy:
NumPy (Numerical Python) is a library for numerical computing in Python. It provides an efficient multidimensional array structure and fast mathematical operations on those arrays.
3. Matplotlib:
Matplotlib is a library for visualizing data in Python. It provides functions and tools for building informative charts and plots.
4. SciPy:
SciPy is a library for scientific computing. Its `scipy.stats` module supplies the statistical test used in the hypothesis-testing step.

In [83]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

<h1 id="data-awal">Data Awal</h1>

In this step, the Excel file that serves as the data source is imported and then displayed using pandas.

In [84]:
excel_path = "dataset.xlsx"
df = pd.read_excel(excel_path,usecols=[1,2,3,4]).astype(str)
df.head(12)

Unnamed: 0,Bulan,PM1,PM2,PM3
0,Jan,28,25,15
1,Feb,15,23,18
2,Mar,22,26,10
3,Apr,9,5,10
4,Mei,30,28,16
5,Jun,27,23,14
6,Jul,35,25,18
7,Agu,28,20,18
8,Sep,18,22,18
9,Okt,25,18,10


<h1 id="preprocessing">Preprocessing</h1>

Preprocessing is the stage in which raw data, not yet ready for analysis or modeling, is prepared and cleaned so that it suits the steps that follow.

Its goal is to improve the quality and usefulness of the data by removing inconsistencies, noise, or defects that may be present in the raw data.

## 1. Data Format

Data type conversion is a common preprocessing step that ensures the data is in a format suitable for the planned analysis or modeling.

It changes values from one type to another that better fits the task; here, the sales columns that were read as text are converted to integers.

In [85]:
# Strip any stray letters from the sales columns, then cast them to integers.
# regex=True makes pandas treat '[a-zA-Z]' as a pattern rather than a literal string.
for col in ["PM1", "PM2", "PM3"]:
    df[col] = df[col].str.replace('[a-zA-Z]', '', regex=True).astype(int)
df.dtypes


Bulan    object
PM1       int32
PM2       int32
PM3       int32
dtype: object
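
The cleaning step above can be tried on a small made-up column. The sketch below assumes hypothetical raw values such as `"28 pcs"` (they are not taken from the dataset) and shows the regex-based strip followed by an integer cast; `raw` and `cleaned` are illustrative names only.

```python
import pandas as pd

# Hypothetical raw values; the real dataset may contain different debris.
raw = pd.Series(["28 pcs", "15", "22unit"])

# regex=True makes pandas treat the first argument as a pattern.
cleaned = (
    raw.str.replace(r"[a-zA-Z]", "", regex=True)
       .str.strip()   # remove the space left behind by "28 pcs"
       .astype(int)
)
print(cleaned.tolist())
```

If a value contains anything other than digits after the letters are removed, `astype(int)` raises an error, which is a useful signal that the pattern needs widening.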

## 2. Missing Data

Handling empty or missing values is an important preprocessing step, because they can degrade the quality of the analysis or model.

A simple approach is to drop the rows or columns that contain missing values. It should be used with care, however, since it can discard important information when a substantial share of the data is missing.

In [86]:
df = df.dropna()
df.head(12)

Unnamed: 0,Bulan,PM1,PM2,PM3
0,Jan,28,25,15
1,Feb,15,23,18
2,Mar,22,26,10
3,Apr,9,5,10
4,Mei,30,28,16
5,Jun,27,23,14
6,Jul,35,25,18
7,Agu,28,20,18
8,Sep,18,22,18
9,Okt,25,18,10
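
To illustrate the trade-off described above, the sketch below compares dropping incomplete rows with filling the gap. The frame `sample` is hypothetical (the dataset shown here has no empty cells), and median imputation is only one possible alternative, not the method used in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing PM1 value.
sample = pd.DataFrame({"Bulan": ["Jan", "Feb", "Mar"], "PM1": [28, np.nan, 22]})

# How many values are missing in each column.
missing_per_column = sample.isna().sum()

# Option 1: drop every row that has a missing value.
dropped = sample.dropna()

# Option 2: keep the row and fill the gap with the column median.
filled = sample.fillna({"PM1": sample["PM1"].median()})

print(missing_per_column["PM1"], len(dropped), filled["PM1"].tolist())
```

Dropping keeps only observed values but shortens the series; filling keeps all twelve months at the cost of introducing an estimated figure.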


<h1 id="hipotesis">Hypothesis Testing</h1>

In [87]:
h0 = "There is no significant difference in sales between the motorcycle equipment items"
h1 = "There is a significant difference in sales between the motorcycle equipment items"

# One-way ANOVA across the three items; f_oneway returns the F statistic and the p-value.
f_statistic, p_value = stats.f_oneway(df['PM1'], df['PM2'], df['PM3'])
alpha = 0.05

print("Hypothesis test result:","\np-value:",p_value)
if p_value < alpha:
    print("Null hypothesis rejected:",h1)
else:
    print("Failed to reject the null hypothesis:",h0)

Hypothesis test result: 
p-value: 0.002475154321574239
Null hypothesis rejected: There is a significant difference in sales between the motorcycle equipment items
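
For readers who want to see what the one-way ANOVA statistic measures, here is a minimal pure-Python sketch of the F ratio (between-group variance over within-group variance). The function name `one_way_anova_f` and the toy groups are assumptions for illustration; `stats.f_oneway` computes this statistic and additionally returns the p-value.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F = (SS_between / (k - 1)) / (SS_within / (n - k)) for k groups, n values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n = len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy groups, not the sales data.
f_value = one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f_value)
```

A larger F means the group means differ more than the spread inside each group would explain, which is what drives the small p-value reported above.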


<h1 id="analisis">Monte Carlo Analysis Method</h1>

### a. Probability Distribution

In [88]:
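def calculaterProbabilitas(items):
    # Each month's share of the item's total sales over the period.
    total = sum(items)  # computed once instead of once per row
    dp = []
    for item in items:
        dp.append(item/total)
    return dp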

df["DP PM1"] = calculaterProbabilitas(df["PM1"])
df["DP PM2"] = calculaterProbabilitas(df["PM2"])
df["DP PM3"] = calculaterProbabilitas(df["PM3"])
df.head(20)

Unnamed: 0,Bulan,PM1,PM2,PM3,DP PM1,DP PM2,DP PM3
0,Jan,28,25,15,0.096886,0.09542,0.081967
1,Feb,15,23,18,0.051903,0.087786,0.098361
2,Mar,22,26,10,0.076125,0.099237,0.054645
3,Apr,9,5,10,0.031142,0.019084,0.054645
4,Mei,30,28,16,0.103806,0.10687,0.087432
5,Jun,27,23,14,0.093426,0.087786,0.076503
6,Jul,35,25,18,0.121107,0.09542,0.098361
7,Agu,28,20,18,0.096886,0.076336,0.098361
8,Sep,18,22,18,0.062284,0.083969,0.098361
9,Okt,25,18,10,0.086505,0.068702,0.054645
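
As a worked instance of the calculation above, each probability is the month's sales divided by the item's total over the whole period. The symbols $p$ and $x$ are notation introduced here for illustration. Using the January PM1 figure and the PM1 total of 289 reported in the accuracy table:

$$p_{\text{Jan}} = \frac{x_{\text{Jan}}}{\sum_j x_j} = \frac{28}{289} \approx 0.096886$$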


### b. Cumulative Probability

In [89]:
def calculateKumulatif(probs):
    # Running total of the probabilities, rounded to two decimals at every step.
    kum = []
    kum.append(round(probs[0],2))
    for i in range(1,len(probs)):
        kum.append(round(kum[i-1]+probs[i],2))
    return kum

df["Kum PM1"] = calculateKumulatif(df["DP PM1"])
df["Kum PM2"] = calculateKumulatif(df["DP PM2"])
df["Kum PM3"] = calculateKumulatif(df["DP PM3"])
df.head(20)


Unnamed: 0,Bulan,PM1,PM2,PM3,DP PM1,DP PM2,DP PM3,Kum PM1,Kum PM2,Kum PM3
0,Jan,28,25,15,0.096886,0.09542,0.081967,0.1,0.1,0.08
1,Feb,15,23,18,0.051903,0.087786,0.098361,0.15,0.19,0.18
2,Mar,22,26,10,0.076125,0.099237,0.054645,0.23,0.29,0.23
3,Apr,9,5,10,0.031142,0.019084,0.054645,0.26,0.31,0.28
4,Mei,30,28,16,0.103806,0.10687,0.087432,0.36,0.42,0.37
5,Jun,27,23,14,0.093426,0.087786,0.076503,0.45,0.51,0.45
6,Jul,35,25,18,0.121107,0.09542,0.098361,0.57,0.61,0.55
7,Agu,28,20,18,0.096886,0.076336,0.098361,0.67,0.69,0.65
8,Sep,18,22,18,0.062284,0.083969,0.098361,0.73,0.77,0.75
9,Okt,25,18,10,0.086505,0.068702,0.054645,0.82,0.84,0.8
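
Because the cumulative values are rounded after every addition, small rounding errors can carry forward. The sketch below reproduces the notebook's approach on the ten DP PM1 values shown above and compares it with a hypothetical alternative, `cumulative_exact`, that accumulates first and rounds only at the end. Both helper names are assumptions for illustration.

```python
from itertools import accumulate

# DP PM1 values for the ten months shown in the table above.
dp_pm1 = [0.096886, 0.051903, 0.076125, 0.031142, 0.103806,
          0.093426, 0.121107, 0.096886, 0.062284, 0.086505]

def cumulative_stepwise(probs):
    """Round after every addition, as calculateKumulatif does."""
    out = [round(probs[0], 2)]
    for p in probs[1:]:
        out.append(round(out[-1] + p, 2))
    return out

def cumulative_exact(probs):
    """Hypothetical alternative: accumulate first, round only for display."""
    return [round(total, 2) for total in accumulate(probs)]

stepwise = cumulative_stepwise(dp_pm1)
exact = cumulative_exact(dp_pm1)
print(stepwise)
print(exact)
```

With these inputs the two lists differ at March (0.23 stepwise versus 0.22 when rounding last), which shifts the boundary between the March and April intervals by one.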


### c. Random Number Intervals

In [90]:
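def generateInterval(kums):
    # Map each cumulative probability to an integer range on a 1-100 scale.
    # Note: int() truncates, so a product such as 0.29*100 (28.999...) becomes 28.
    intervalBottom = []
    intervalTop = []
    intervalBottom.append(1)
    intervalTop.append(int(kums[0]*100))
    for i in range(1,len(kums)):
        intervalBottom.append(intervalTop[i-1]+1)
        intervalTop.append(int(kums[i]*100))
    return intervalBottom,intervalTop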

df["Interval Bottom Rand PM1"],df["Interval Top Rand PM1"] = generateInterval(df["Kum PM1"])
df["Interval Bottom Rand PM2"],df["Interval Top Rand PM2"] = generateInterval(df["Kum PM2"])
df["Interval Bottom Rand PM3"],df["Interval Top Rand PM3"] = generateInterval(df["Kum PM3"])
df.head(20)

Unnamed: 0,Bulan,PM1,PM2,PM3,DP PM1,DP PM2,DP PM3,Kum PM1,Kum PM2,Kum PM3,Interval Bottom Rand PM1,Interval Top Rand PM1,Interval Bottom Rand PM2,Interval Top Rand PM2,Interval Bottom Rand PM3,Interval Top Rand PM3
0,Jan,28,25,15,0.096886,0.09542,0.081967,0.1,0.1,0.08,1,10,1,10,1,8
1,Feb,15,23,18,0.051903,0.087786,0.098361,0.15,0.19,0.18,11,15,11,19,9,18
2,Mar,22,26,10,0.076125,0.099237,0.054645,0.23,0.29,0.23,16,23,20,28,19,23
3,Apr,9,5,10,0.031142,0.019084,0.054645,0.26,0.31,0.28,24,26,29,31,24,28
4,Mei,30,28,16,0.103806,0.10687,0.087432,0.36,0.42,0.37,27,36,32,42,29,37
5,Jun,27,23,14,0.093426,0.087786,0.076503,0.45,0.51,0.45,37,45,43,51,38,45
6,Jul,35,25,18,0.121107,0.09542,0.098361,0.57,0.61,0.55,46,56,52,61,46,55
7,Agu,28,20,18,0.096886,0.076336,0.098361,0.67,0.69,0.65,57,67,62,69,56,65
8,Sep,18,22,18,0.062284,0.083969,0.098361,0.73,0.77,0.75,68,73,70,77,66,75
9,Okt,25,18,10,0.086505,0.068702,0.054645,0.82,0.84,0.8,74,82,78,84,76,80
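
The interval table above shows an effect of floating-point truncation: Kum PM2 for March is 0.29, yet its upper bound is 28, and Kum PM1 for July is 0.57 with an upper bound of 56. The sketch below demonstrates the cause and a hypothetical variant, `interval_bounds`, that rounds instead of truncating. It is offered as an option to consider, not as the notebook's method.

```python
def interval_bounds(cumulative, scale=100):
    """Hypothetical variant of generateInterval that rounds to the nearest integer."""
    bottoms, tops = [], []
    previous_top = 0
    for value in cumulative:
        top = round(value * scale)
        bottoms.append(previous_top + 1)
        tops.append(top)
        previous_top = top
    return bottoms, tops

# First four Kum PM2 values from the table above.
kum_pm2 = [0.1, 0.19, 0.29, 0.31]

truncated = [int(k * 100) for k in kum_pm2]   # what int() produces
bottoms, tops = interval_bounds(kum_pm2)      # what round() produces
print(truncated)
print(bottoms, tops)
```

The difference arises because `0.29 * 100` is stored as a value slightly below 29, so `int()` cuts it down to 28 while `round()` returns 29.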


### d. Random Number Generation

Random numbers are generated with the linear congruential formula:

$$P_i = (a \cdot P_{i-1} + c) \bmod m$$

where:<br>
$P_{i-1}$ = the previous random number<br>
$P_i$ = the $i$-th random number<br>
$a$ = the multiplier<br>
$c$ = the increment<br>
$m$ = the modulus<br>
$i$ = 1, 2, 3, …, $n$<br>

Given: $P_0 = 23$ (the seed), $a = 3$, $c = 13$, $m = 100$
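
Applying the formula to the given parameters, starting from the seed value 23, gives the first three random numbers, which match the first three entries of the Random column produced by the code that follows:

$$P_1 = (3 \cdot 23 + 13) \bmod 100 = 82$$

$$P_2 = (3 \cdot 82 + 13) \bmod 100 = 259 \bmod 100 = 59$$

$$P_3 = (3 \cdot 59 + 13) \bmod 100 = 190 \bmod 100 = 90$$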

In [91]:
excel_path = "dataset.xlsx"
dfOriginal = pd.read_excel(excel_path,usecols=[1,2,3,4]).astype(str)
dfOriginal.head(12)

Unnamed: 0,Bulan,PM1,PM2,PM3
0,Jan,28,25,15
1,Feb,15,23,18
2,Mar,22,26,10
3,Apr,9,5,10
4,Mei,30,28,16
5,Jun,27,23,14
6,Jul,35,25,18
7,Agu,28,20,18
8,Sep,18,22,18
9,Okt,25,18,10


In [92]:
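def formula(a,pi,c,m):
    # One linear congruential step: the next value computed from the previous value pi.
    return (a*pi+c)%m
def generateRand(pm):
    # Produce one random number per row of pm, starting from the seed 23.
    pi = 23
    a = 3
    c = 13
    m = 100
    rand = []
    rand.append(int(formula(a,pi,c,m)))
    for i in range(1,len(pm)):
        rand.append(int(formula(a,rand[i-1],c,m)))
    return rand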

dfOriginal["Random"] = generateRand(dfOriginal["PM1"])
dfOriginal.head(20)
    

Unnamed: 0,Bulan,PM1,PM2,PM3,Random
0,Jan,28,25,15,82
1,Feb,15,23,18,59
2,Mar,22,26,10,90
3,Apr,9,5,10,83
4,Mei,30,28,16,62
5,Jun,27,23,14,99
6,Jul,35,25,18,10
7,Agu,28,20,18,43
8,Sep,18,22,18,42
9,Okt,25,18,10,39
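
A generator of this kind eventually repeats, so it can be useful to know how long its cycle is. The sketch below rewrites the recurrence as a Python generator and counts the cycle length for the parameters used here. The names `lcg` and `cycle_length` are assumptions for illustration, and the cycle check assumes the multiplier is coprime with the modulus (true for 3 and 100), so that the sequence returns to its first value.

```python
from itertools import islice

def lcg(seed, a=3, c=13, m=100):
    """Yield successive values of P_i = (a * P_{i-1} + c) mod m."""
    value = seed
    while True:
        value = (a * value + c) % m
        yield value

def cycle_length(seed, a=3, c=13, m=100):
    """Count steps until the sequence returns to its first generated value.

    Assumes a is coprime with m; otherwise the first value may never recur.
    """
    gen = lcg(seed, a, c, m)
    first = next(gen)
    steps = 1
    for value in gen:
        if value == first:
            return steps
        steps += 1

first_ten = list(islice(lcg(23), 10))
print(first_ten)
print(cycle_length(23))
```

With these parameters the sequence repeats after 20 values, so a twelve-month run sees no repetition, while a longer simulation would start reusing the same random numbers.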


### e. Simulation

In [93]:
dfSimulation = pd.DataFrame(dfOriginal["Bulan"])
dfSimulation["Random"] = dfOriginal["Random"]
dfSimulation.head(20)
    

Unnamed: 0,Bulan,Random
0,Jan,82
1,Feb,59
2,Mar,90
3,Apr,83
4,Mei,62
5,Jun,99
6,Jul,10
7,Agu,43
8,Sep,42
9,Okt,39


In [94]:
def simulation(rands,intervalBottom,intervalTop,pm):
    # For each random number, find the interval that contains it and
    # take the sales value of the matching row.
    length = len(intervalBottom)
    sim = []
    for i in range(length):
        for j in range(length):
            if(intervalBottom[j] <= rands[i] <= intervalTop[j]):
                sim.append(pm[j])
                break
    return sim
dfSimulation["PM1"] = simulation(dfSimulation["Random"],df["Interval Bottom Rand PM1"],df["Interval Top Rand PM1"],df["PM1"])
dfSimulation["PM2"] = simulation(dfSimulation["Random"],df["Interval Bottom Rand PM2"],df["Interval Top Rand PM2"],df["PM2"])
dfSimulation["PM3"] = simulation(dfSimulation["Random"],df["Interval Bottom Rand PM3"],df["Interval Top Rand PM3"],df["PM3"])
dfSimulation.head(20)

Unnamed: 0,Bulan,Random,PM1,PM2,PM3
0,Jan,82,25,18,16
1,Feb,59,28,25,18
2,Mar,90,32,22,20
3,Apr,83,32,18,16
4,Mei,62,28,20,18
5,Jun,99,20,25,20
6,Jul,10,28,25,18
7,Agu,43,27,23,14
8,Sep,42,27,28,14
9,Okt,39,27,28,14
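
The interval lookup can also be written with a binary search over the upper bounds. The sketch below assumes the intervals are contiguous and start at 1, and uses only the ten PM1 rows visible in the interval table, so numbers above 82 have no match here. The helper `lookup` is a hypothetical name, and unlike the nested loop it returns `None` explicitly when no interval contains the number.

```python
from bisect import bisect_left

# Upper interval bounds and sales for PM1, taken from the ten months shown above.
tops_pm1 = [10, 15, 23, 26, 36, 45, 56, 67, 73, 82]
sales_pm1 = [28, 15, 22, 9, 30, 27, 35, 28, 18, 25]

def lookup(rand, tops, values):
    """Return the value whose interval contains rand, or None if none does."""
    if rand < 1:
        return None
    idx = bisect_left(tops, rand)  # first interval whose upper bound is >= rand
    return values[idx] if idx < len(tops) else None

print(lookup(82, tops_pm1, sales_pm1))
print(lookup(10, tops_pm1, sales_pm1))
print(lookup(90, tops_pm1, sales_pm1))
```

Returning `None` makes an unmatched random number visible, whereas skipping it silently would leave the simulated column shorter than the number of months.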


### f. Accuracy Calculation

In [95]:
real = []
real.append(sum(df["PM1"]))
real.append(sum(df["PM2"]))
real.append(sum(df["PM3"]))
real.append(sum(real))
simulasi = []
simulasi.append(sum(dfSimulation["PM1"]))
simulasi.append(sum(dfSimulation["PM2"]))
simulasi.append(sum(dfSimulation["PM3"]))
simulasi.append(sum(simulasi))

akurasi = pd.DataFrame()
akurasi["Kode Peralatan Motor"] = ["PM1","PM2","PM3","Total : "]
akurasi["Data Real"] = real
akurasi["Data Simulasi"] = simulasi

akurasi.head(10)

Unnamed: 0,Kode Peralatan Motor,Data Real,Data Simulasi
0,PM1,289,332
1,PM2,262,262
2,PM3,183,199
3,Total :,734,793


In [108]:
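def calculateAccuracy(real,simulasi):
    # Per-item accuracy is min/max of the real and simulated totals. The last row
    # holds the summed differences and the mean of the per-item accuracies.
    akurasi = []
    differences = []
    for i in range(len(real)-1):  # skip the trailing "Total" row
        differences.append(abs(real[i]-simulasi[i]))
        akurasi.append(min(real[i],simulasi[i])/max(real[i],simulasi[i]))
    differences.append(sum(differences))
    akurasi.append(np.mean(akurasi))
    for i in range(len(akurasi)):
        akurasi[i] = str(int(akurasi[i]*100)) + "%"  # truncates to a whole percent
    return differences,akurasi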

akurasi["Selisih"],akurasi["Akurasi"] = calculateAccuracy(akurasi["Data Real"], akurasi["Data Simulasi"])
akurasi.head(10)

Unnamed: 0,Kode Peralatan Motor,Data Real,Data Simulasi,Selisih,Akurasi
0,PM1,289,332,43,87%
1,PM2,262,262,0,100%
2,PM3,183,199,16,91%
3,Total :,734,793,59,93%
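
The figures in the table can be reproduced by hand. The sketch below recomputes the per-item accuracy as the ratio of the smaller total to the larger one, then shows two ways to summarize it: the mean of the three ratios (which the table reports) and, as an assumed alternative for comparison, the ratio of the grand totals. The names `ratio_accuracy`, `mean_accuracy` and `total_accuracy` are illustrative.

```python
# Totals taken from the accuracy table above.
real = {"PM1": 289, "PM2": 262, "PM3": 183}
simulated = {"PM1": 332, "PM2": 262, "PM3": 199}

def ratio_accuracy(a, b):
    """Smaller value divided by larger value, so the result is at most 1."""
    return min(a, b) / max(a, b)

per_item = {name: ratio_accuracy(real[name], simulated[name]) for name in real}

# Summary 1: mean of the per-item ratios (the approach used in the table).
mean_accuracy = sum(per_item.values()) / len(per_item)

# Summary 2 (alternative): ratio of the grand totals.
total_accuracy = ratio_accuracy(sum(real.values()), sum(simulated.values()))

for name, acc in per_item.items():
    print(name, f"{int(acc * 100)}%")
print(f"{int(mean_accuracy * 100)}%", f"{int(total_accuracy * 100)}%")
```

The two summaries need not agree: averaging the ratios weights each item equally, while the grand-total ratio weights items by their sales volume.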
