<a href="https://colab.research.google.com/github/AnejVollmeier/Analiza-trga-kriptovalut-in-napoved-gibanja-cen/blob/main/Kriptovalute_Anej_Vollmeier.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**Cryptocurrency Market Analysis and Price Movement Prediction**

##**Data Acquisition**

*We fetch cryptocurrency data from the CoinGecko web API. The responses are combined into a single DataFrame, from which we keep only the relevant columns. Finally, we save the data to a CSV file that is used in the subsequent processing steps.*

In [None]:
import requests
import pandas as pd
import time

all_data = []

for page in range(1, 4):  # 3 pages * 100 coins = 300
    url = "https://api.coingecko.com/api/v3/coins/markets"
    params = {
        "vs_currency": "eur",
        "order": "market_cap_desc",
        "per_page": 100,
        "page": page,
        "sparkline": "false",
        "price_change_percentage": "1h,24h,7d,30d,1y"
    }

    response = requests.get(url, params=params, timeout=30)

    # guard against rate limiting: on HTTP 429, wait and retry once
    if response.status_code == 429:
        print("Too many requests, waiting 10 seconds ...")
        time.sleep(10)
        response = requests.get(url, params=params, timeout=30)

    response.raise_for_status()
    all_data.extend(response.json())

    time.sleep(2)  # pause between pages to stay within CoinGecko rate limits

df = pd.DataFrame(all_data)

columns = [
    "id",
    "symbol",
    "name",

    # rank and valuation
    "market_cap_rank",
    "market_cap",
    "fully_diluted_valuation",

    # price and volume
    "current_price",
    "total_volume",

    # volatility (24h range)
    "high_24h",
    "low_24h",

    # price changes
    "price_change_percentage_1h_in_currency",
    "price_change_percentage_24h_in_currency",
    "price_change_percentage_7d_in_currency",
    "price_change_percentage_30d_in_currency",
    "price_change_percentage_1y_in_currency",
]

# defensive: keep only the columns that actually exist in the response
df = df.loc[:, [c for c in columns if c in df.columns]].copy()

df.to_csv(
    "coingecko_kriptovalute.csv",
    sep=";",
    decimal=",",
    index=False,
)

df.head()


Unnamed: 0,id,symbol,name,market_cap_rank,market_cap,fully_diluted_valuation,current_price,total_volume,high_24h,low_24h,price_change_percentage_1h_in_currency,price_change_percentage_24h_in_currency,price_change_percentage_7d_in_currency,price_change_percentage_30d_in_currency,price_change_percentage_1y_in_currency
0,bitcoin,btc,Bitcoin,1,1482758572631,1482758572631,74306.0,18508550000.0,75115.0,74093.0,-0.329729,-0.337,1.796631,-1.855093,-21.459037
1,ethereum,eth,Ethereum,2,298589789894,298589789894,2476.34,9756803000.0,2519.15,2472.89,-0.929359,-1.226551,2.668288,-3.579843,-25.610297
2,tether,usdt,Tether,3,158559572006,163197781005,0.848386,32995440000.0,0.848984,0.848054,0.026727,-0.058184,-0.493916,-1.816736,-11.65294
3,binancecoin,bnb,BNB,4,97700131248,97700131248,709.71,564502800.0,720.28,709.41,-0.669731,-1.435709,0.139741,-4.75463,5.734542
4,ripple,xrp,XRP,5,94776132360,156443802958,1.56,1221073000.0,1.6,1.57,-1.12829,-1.26086,0.872455,-17.905337,-29.184791
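
*The download loop above retries only once after an HTTP 429 response. As a minimal sketch of a more forgiving approach, the helper below retries several times with doubling delays. The name `fetch_with_backoff` and the parameters `max_retries` and `base_delay` are hypothetical and not part of the original notebook or of the `requests` library; the HTTP call is passed in as `get` so that the helper itself needs no network access.*

```python
import time


def fetch_with_backoff(get, url, params, max_retries=4, base_delay=2.0, sleep=time.sleep):
    """Call get(url, params=params); on HTTP 429 wait and retry with doubling delays."""
    response = get(url, params=params)
    for attempt in range(max_retries):
        if response.status_code != 429:
            break  # success or a different error: stop retrying
        sleep(base_delay * 2 ** attempt)  # 2 s, 4 s, 8 s, ...
        response = get(url, params=params)
    return response
```

*In the loop above, this could be called as `fetch_with_backoff(functools.partial(requests.get, timeout=30), url, params)`, followed by `response.raise_for_status()` as before.*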


*We load the cryptocurrency data from the CSV file into a DataFrame. On import we specify the separator and the decimal character, and set the **id** column as the index.*

In [None]:
df = pd.read_csv(
    "coingecko_kriptovalute.csv",
    sep=";",
    decimal=",",
    index_col=0,  # the first column ("id") becomes the index
)
df.head()

id,symbol,name,market_cap_rank,market_cap,fully_diluted_valuation,current_price,total_volume,high_24h,low_24h,price_change_percentage_1h_in_currency,price_change_percentage_24h_in_currency,price_change_percentage_7d_in_currency,price_change_percentage_30d_in_currency,price_change_percentage_1y_in_currency
bitcoin,btc,Bitcoin,1,1482758572631,1482758572631,74306.0,18508550000.0,75115.0,74093.0,-0.329729,-0.337,1.796631,-1.855093,-21.459037
ethereum,eth,Ethereum,2,298589789894,298589789894,2476.34,9756803000.0,2519.15,2472.89,-0.929359,-1.226551,2.668288,-3.579843,-25.610297
tether,usdt,Tether,3,158559572006,163197781005,0.848386,32995440000.0,0.848984,0.848054,0.026727,-0.058184,-0.493916,-1.816736,-11.65294
binancecoin,bnb,BNB,4,97700131248,97700131248,709.71,564502800.0,720.28,709.41,-0.669731,-1.435709,0.139741,-4.75463,5.734542
ripple,xrp,XRP,5,94776132360,156443802958,1.56,1221073000.0,1.6,1.57,-1.12829,-1.26086,0.872455,-17.905337,-29.184791
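
*The `sep` and `decimal` arguments have to match between `to_csv` and `read_csv`. The sketch below illustrates this on a two-row toy DataFrame written to an in-memory buffer; the toy values and variable names are illustrative assumptions. When `decimal=","` is omitted on reading, the price column is no longer parsed as numbers.*

```python
import io

import pandas as pd

sample = pd.DataFrame({
    "id": ["bitcoin", "tether"],
    "current_price": [74306.0, 0.848386],
})

# write with ";" as separator and "," as decimal mark, as in the notebook
buffer = io.StringIO()
sample.to_csv(buffer, sep=";", decimal=",", index=False)
text = buffer.getvalue()

# matching options: prices come back as floats, "id" becomes the index
restored = pd.read_csv(io.StringIO(text), sep=";", decimal=",", index_col=0)

# decimal mark omitted: values such as "0,848386" stay text
mismatched = pd.read_csv(io.StringIO(text), sep=";", index_col=0)

print(restored.dtypes)
print(mismatched.dtypes)
```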


##**Data Preprocessing**

*`df.shape` --> checks the dimensions of the DataFrame, i.e. the number of rows and columns.*

In [None]:
df.shape

(300, 14)

*`df.info()` --> shows basic information about the DataFrame, such as column names, data types, and the number of non-null values per column.*

In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 300 entries, bitcoin to staked-frax-ether
Data columns (total 14 columns):
 #   Column                                   Non-Null Count  Dtype  
---  ------                                   --------------  -----  
 0   symbol                                   300 non-null    object 
 1   name                                     300 non-null    object 
 2   market_cap_rank                          300 non-null    int64  
 3   market_cap                               300 non-null    int64  
 4   fully_diluted_valuation                  300 non-null    int64  
 5   current_price                            300 non-null    float64
 6   total_volume                             292 non-null    float64
 7   high_24h                                 298 non-null    float64
 8   low_24h                                  298 non-null    float64
 9   price_change_percentage_1h_in_currency   297 non-null    float64
 10  price_change_percentage_24h_in_curr

*`df.isnull().sum()` --> shows the number of missing values in each column of the DataFrame.*

In [None]:
df.isnull().sum()

Unnamed: 0,0
symbol,0
name,0
market_cap_rank,0
market_cap,0
fully_diluted_valuation,0
current_price,0
total_volume,8
high_24h,2
low_24h,2
price_change_percentage_1h_in_currency,3


*We drop the columns that are not relevant for the further analysis or that contain too many missing values.*

*   `symbol` (not relevant)
*   `name` (not relevant)
*   `price_change_percentage_1y_in_currency` (too many missing values)





In [None]:
df = df.drop(columns=[
    "symbol",
    "name",
    "price_change_percentage_1y_in_currency",
])

df.head(3)

id,market_cap_rank,market_cap,fully_diluted_valuation,current_price,total_volume,high_24h,low_24h,price_change_percentage_1h_in_currency,price_change_percentage_24h_in_currency,price_change_percentage_7d_in_currency,price_change_percentage_30d_in_currency
bitcoin,1,1482758572631,1482758572631,74306.0,18508550000.0,75115.0,74093.0,-0.329729,-0.337,1.796631,-1.855093
ethereum,2,298589789894,298589789894,2476.34,9756803000.0,2519.15,2472.89,-0.929359,-1.226551,2.668288,-3.579843
tether,3,158559572006,163197781005,0.848386,32995440000.0,0.848984,0.848054,0.026727,-0.058184,-0.493916,-1.816736
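
*The "too many missing values" criterion used above for dropping a column can be made explicit with a threshold on the share of missing values per column. The following is a minimal sketch on toy data; the threshold of 0.5 and the toy column names are assumptions chosen for illustration, not values taken from the notebook.*

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "current_price": [1.0, 2.0, 3.0, 4.0],
    "total_volume": [10.0, np.nan, 30.0, 40.0],
    "price_change_1y": [np.nan, np.nan, np.nan, 5.0],
})

# share of missing values per column (mean of a boolean mask)
missing_share = toy.isnull().mean()

# hypothetical cut-off: drop columns with more than half of the values missing
threshold = 0.5
to_drop = missing_share[missing_share > threshold].index.tolist()

reduced = toy.drop(columns=to_drop)
print(missing_share)
print(to_drop)
```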


*We select all numeric columns and fill the missing values with each column's median. Then we verify that the DataFrame no longer contains any missing data.*

*   `median()` --> the middle value of a sorted data set



In [None]:
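# select the numeric columns (everything that is not of object dtype)
num = df.select_dtypes(exclude=object).columns
# fill each column's gaps with that column's median
df[num] = df[num].fillna(df[num].median())

# verify that no missing values remain
df.isnull().sum()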

Unnamed: 0,0
market_cap_rank,0
market_cap,0
fully_diluted_valuation,0
current_price,0
total_volume,0
high_24h,0
low_24h,0
price_change_percentage_1h_in_currency,0
price_change_percentage_24h_in_currency,0
price_change_percentage_7d_in_currency,0


*We standardize the numeric data with* **StandardScaler**, *so that each column has zero mean and unit variance.*

In [None]:
from sklearn.preprocessing import StandardScaler

df[num] = StandardScaler().fit_transform(df[num])

df.head(3)

id,market_cap_rank,market_cap,fully_diluted_valuation,current_price,total_volume,high_24h,low_24h,price_change_percentage_1h_in_currency,price_change_percentage_24h_in_currency,price_change_percentage_7d_in_currency,price_change_percentage_30d_in_currency
bitcoin,-1.726287,16.766645,11.698139,4.20474,8.047751,4.072564,4.220861,0.356489,0.116568,-0.027328,0.115948
ethereum,-1.71474,3.295931,2.262106,-0.108638,4.18448,-0.110468,-0.107954,-0.030535,-0.163562,0.064793,0.07693
tether,-1.703193,1.702993,1.183236,-0.257292,14.442685,-0.255574,-0.257368,0.58656,0.20437,-0.269403,0.116816
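
*Standardization replaces each value x with z = (x - mean) / std, computed per column. The sketch below reproduces this by hand with pandas on a toy column, using the population standard deviation (`ddof=0`), which is also what `StandardScaler` uses by default. The toy values are assumptions for illustration.*

```python
import pandas as pd

toy = pd.DataFrame({"current_price": [1.0, 2.0, 3.0, 4.0, 5.0]})

mean = toy.mean()       # per-column mean
std = toy.std(ddof=0)   # per-column population standard deviation

# z-score: distance from the mean measured in standard deviations
z = (toy - mean) / std

print(z)
```

*In the output above, Bitcoin's standardized `market_cap` is about 16.8, i.e. roughly 16.8 standard deviations above the mean, which shows how skewed the raw market capitalizations are.*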
