About Dataset

Global Crypto Currency Database

The Global Crypto Currency Database is a curated dataset covering more than 7,500 cryptocurrencies, each paired with the US Dollar (USD). It is intended for anyone who wants to explore digital currencies and analyze their market behavior. It includes popular coins such as BTC, ETH, and SOL as well as newly released coins.
Dataset File Structure

The Dataset is structured with the following key fields:

Name: Coin Name

    Type: String
    Description: The official name of the cryptocurrency, enabling easy identification and reference to specific digital coins.

Symbol: Trading Symbol of the Coin

    Type: String
    Description: This field provides the unique trading symbol associated with each cryptocurrency, a vital element for traders and investors.

Date: Date of the Price

    Type: Datetime
    Description: Accurate time-stamping allows for precise tracking of cryptocurrency prices, facilitating trend analysis and historical comparisons.

Open: Opening Price of the Day

    Type: Number
    Description: The opening price signifies the value at which the cryptocurrency began trading on a particular day, offering insights into market sentiment.

High: Highest Price of the Day

    Type: Number
    Description: The highest price recorded during the day provides a glimpse into the cryptocurrency's peak performance within a given timeframe.

Low: Lowest Price of the Day

    Type: Number
    Description: The lowest price registered during the day offers a perspective on the cryptocurrency's lowest trading point within that period.

Close: Closing Price of the Day

    Type: Number
    Description: The closing price represents the cryptocurrency's final trading value for the day, crucial for assessing daily market performance.

Adj Close: Adjusted Closing Price of the Day

    Type: Number
    Description: The closing price adjusted for corporate actions such as dividends and stock splits. These adjustments apply to equities rather than cryptocurrencies, so this value is typically the same as Close.
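
As a minimal sketch of how a price file with these fields might be loaded and sanity-checked, the example below parses a small inline CSV. The sample rows are made up for illustration, the Name and Symbol fields are left out for brevity, and the exact column spelling and date format of the real files are assumptions.

```python
import io
import pandas as pd

# Made-up sample rows that follow the field list above (not real prices).
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close\n"
    "2023-01-01,100.0,110.0,95.0,105.0,105.0\n"
    "2023-01-02,105.0,108.0,99.0,101.0,101.0\n"
    "2023-01-03,101.0,120.0,100.0,118.0,118.0\n"
)

# Parse the Date column as datetimes and use it as a sorted index.
prices = pd.read_csv(sample_csv, parse_dates=["Date"]).set_index("Date").sort_index()

# Basic consistency checks: High is the daily maximum and Low the daily minimum.
price_cols = ["Open", "High", "Low", "Close"]
high_ok = (prices["High"] >= prices[price_cols].max(axis=1)).all()
low_ok = (prices["Low"] <= prices[price_cols].min(axis=1)).all()

print(prices.dtypes)
print("High is daily max:", high_ok)
print("Low is daily min:", low_ok)
```

Checks like these are cheap to run on every file before analysis, since a row where Low exceeds High usually points to a parsing or data problem.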

Potential Uses
1. Investment Analysis

Investors can leverage this dataset to analyze historical price trends, volatility, and correlations between cryptocurrencies and traditional assets. It helps in making informed investment decisions and managing risk in cryptocurrency portfolios.
2. Market Research

Market researchers can use this dataset to study market dynamics, identify emerging cryptocurrencies, and assess the impact of news events on cryptocurrency prices. It aids in understanding market sentiment and behavior.
3. Algorithmic Trading

Traders and quantitative analysts can develop algorithmic trading strategies based on historical price data. The dataset enables the backtesting of trading algorithms to assess their effectiveness.
4. Risk Management

Risk managers can assess the risk associated with cryptocurrency investments by analyzing historical price volatility and correlations with other asset classes.
5. Academic Research

Academics and researchers can use this dataset to conduct studies on various aspects of the cryptocurrency market, contributing to the academic understanding of digital currencies.
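
Several of the uses above rest on returns, volatility, and correlation. The sketch below shows one common way to compute them with pandas. The two coins are synthetic random walks with hypothetical names, and the 30-day window and 365-day annualization are illustrative assumptions rather than settings taken from this dataset.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes for two hypothetical coins (random walks, not real data).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", periods=120, freq="D")
closes = pd.DataFrame(
    {
        "COIN_A": 100 * np.exp(np.cumsum(rng.normal(0, 0.02, len(dates)))),
        "COIN_B": 50 * np.exp(np.cumsum(rng.normal(0, 0.03, len(dates)))),
    },
    index=dates,
)

# Daily log returns: log(P_t / P_{t-1}).
log_returns = np.log(closes / closes.shift(1)).dropna()

# 30-day rolling volatility, annualized with 365 days because crypto trades every day.
rolling_vol = log_returns.rolling(window=30).std() * np.sqrt(365)

# Pearson correlation between the two return series.
correlation = log_returns.corr()

print(rolling_vol.dropna().tail())
print(correlation)
```

With real data, `closes` would be built by joining the Close columns of several coin files on their Date index.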
Conclusion

The Global Crypto Currency Database supports a range of applications, from investment analysis and strategy backtesting to academic research. Its structured daily price fields let investors, researchers, traders, and enthusiasts compare coins and study how the cryptocurrency market changes over time.

In [None]:
import pandas as pd

In [None]:
import os

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler

In [None]:
meta = pd.read_csv('/kaggle/input/global-cryptocurrency-price-database/metadata.csv')
meta.head()

In [None]:
# Example currency
currency_name = 'Bitcoin USD'

In [None]:
# Look up the data file name for the selected coin pair (third metadata column)
file = meta.loc[meta['Coin Pair Name'] == currency_name].iloc[0, 2]

In [None]:
data = pd.read_csv(os.path.join('/kaggle/input/global-cryptocurrency-price-database/data', file))

In [None]:
data

In [None]:
# Check if CUDA (GPU) is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

In [None]:
# Extract the "Close" prices (the target variable)
close_prices = data['Close'].values.reshape(-1, 1)

In [None]:
# Normalize the data using Min-Max scaling
scaler = MinMaxScaler()
close_prices_scaled = scaler.fit_transform(close_prices)
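
Fitting the scaler on the full series lets the test period influence the scaling. A stricter variant, sketched here on a made-up array, fits the scaler on the training portion only. The 80/20 chronological split is an assumption that mirrors the ratio used later in this notebook.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up closing prices, shaped (n_samples, 1) as MinMaxScaler expects.
prices = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0, 20.0, 25.0, 30.0]).reshape(-1, 1)

# Chronological split: first 80% for training, the rest for testing.
split_index = int(len(prices) * 0.8)
train_prices, test_prices = prices[:split_index], prices[split_index:]

# Fit on the training part only, then apply the same transform to both parts.
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_prices)
test_scaled = scaler.transform(test_prices)

# Training values land in [0, 1]; test values may exceed 1 if prices keep rising.
print(train_scaled.min(), train_scaled.max())
print(test_scaled.ravel())
```

The trade-off is that scaled test values can fall outside the [0, 1] range when later prices exceed anything seen in training.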

In [None]:
# Define a function to create input sequences and their corresponding target values
def create_sequences(data, seq_length):
    sequences = []
    target = []
    for i in range(len(data) - seq_length):
        seq = data[i:i + seq_length]
        label = data[i + seq_length]
        sequences.append(seq)
        target.append(label)
    return np.array(sequences), np.array(target)
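
To make the sliding-window logic concrete, the sketch below applies the same function to a toy series. The function is restated so the snippet runs on its own, and the toy values and window length of 3 are chosen only for illustration.

```python
import numpy as np

def create_sequences(data, seq_length):
    """Same sliding-window logic as the notebook function, restated so this runs alone."""
    sequences, targets = [], []
    for i in range(len(data) - seq_length):
        sequences.append(data[i:i + seq_length])
        targets.append(data[i + seq_length])
    return np.array(sequences), np.array(targets)

# Toy series of 6 values with one feature per step, shaped (6, 1).
toy = np.arange(6, dtype=np.float32).reshape(-1, 1)
X, y = create_sequences(toy, seq_length=3)

print(X.shape, y.shape)
print(X[0].ravel(), "->", y[0])
```

Each window of `seq_length` consecutive values is paired with the single value that follows it, so a series of length n yields n - seq_length samples.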

In [None]:
# Set the sequence length and split the data into sequences and targets
sequence_length = 10  # You can adjust this value
sequences, target = create_sequences(close_prices_scaled, sequence_length)

In [None]:
# Convert data to PyTorch tensors
sequences = torch.tensor(sequences, dtype=torch.float32)
target = torch.tensor(target, dtype=torch.float32)

In [None]:
# Split the data into training and testing sets
split_ratio = 0.8
split_index = int(len(sequences) * split_ratio)
X_train, y_train = sequences[:split_index], target[:split_index]
X_test, y_test = sequences[split_index:], target[split_index:]

In [None]:
# Define the GRU model
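class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        # batch_first=True means inputs are shaped (batch, seq_len, input_size)
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # out holds the hidden state at every time step: (batch, seq_len, hidden_size)
        out, _ = self.gru(x)
        # Use only the last time step to predict the next value
        out = self.fc(out[:, -1, :])
        return out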
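
A quick way to check the model wiring is to pass a dummy batch through it and inspect the output shape. The sketch below restates the model so it runs on its own. The batch of 8 random windows is an assumption for illustration, while the window length of 10 and hidden size of 4 match the values used in this notebook.

```python
import torch
import torch.nn as nn

class GRUModel(nn.Module):
    """Restated from the cell above so this snippet runs on its own."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.gru(x)           # out: (batch, seq_len, hidden_size)
        return self.fc(out[:, -1, :])  # keep only the last time step

# Dummy batch: 8 windows of 10 time steps with 1 feature each.
dummy = torch.randn(8, 10, 1)
model_check = GRUModel(input_size=1, hidden_size=4, output_size=1)
out = model_check(dummy)

# Count trainable parameters to get a feel for the model size.
n_params = sum(p.numel() for p in model_check.parameters())
print(out.shape, n_params)
```

The model returns one prediction per window, which matches the `(batch, 1)` shape of the targets produced by `create_sequences`.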

In [None]:
# Define hyperparameters
input_size = 1
hidden_size = 4
output_size = 1
num_epochs = 50
batch_size = 64
learning_rate = 0.001

In [None]:
# Create DataLoader for training and testing
train_dataset = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True)
test_dataset = TensorDataset(X_test, y_test)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, pin_memory=True)

In [None]:
# Initialize the model
model = GRUModel(input_size, hidden_size, output_size).to(device)

In [None]:
# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

In [None]:
# Define empty lists to store training history
train_loss_history = []
validation_loss_history = []

In [None]:
# Run one training epoch and return the average loss per batch
def train(model, train_loader, criterion, optimizer):
    model.train()
    running_loss = 0.0
    
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
    
    return running_loss / len(train_loader)

# Training loop: record the average loss for each epoch
for epoch in range(num_epochs):
    # Calculate training loss
    train_loss = train(model, train_loader, criterion, optimizer)
    train_loss_history.append(train_loss)

    # Calculate validation loss
    #validation_loss = validate(model, validation_loader, criterion)
    #validation_loss_history.append(validation_loss)

# Plot the training loss history
plt.figure(figsize=(10, 5))
plt.plot(train_loss_history, label='Training Loss')
#plt.plot(validation_loss_history, label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Training Progress')
plt.grid(True)
plt.show()
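
The commented-out validation calls in the training loop refer to a `validate` function and a `validation_loader` that this notebook does not define. The sketch below shows one way such a helper might look. Its name, signature, and the toy model and data used to call it are all hypothetical.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def validate(model, loader, criterion, device):
    """Hypothetical helper: average loss per batch over a loader, with no weight updates."""
    model.eval()
    running_loss = 0.0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            running_loss += criterion(model(inputs), labels).item()
    return running_loss / len(loader)

# Tiny stand-in model and random data, only to show the call.
cpu_device = torch.device("cpu")
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(10, 1)).to(cpu_device)
toy_loader = DataLoader(
    TensorDataset(torch.randn(32, 10, 1), torch.randn(32, 1)), batch_size=8
)

validation_loss = validate(toy_model, toy_loader, nn.MSELoss(), cpu_device)
print(f"Validation loss: {validation_loss:.4f}")
```

In this notebook a validation loader would have to come from a held-out slice of the training sequences, since the test set should stay untouched until the final evaluation.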

In [None]:
# Evaluate the model
model.eval()
test_loss = 0.0
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        test_loss += criterion(outputs, labels).item()

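# Average over batches so the value is comparable to the training loss
test_loss /= len(test_loader)
print(f"Test Loss: {test_loss:.6f}")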

In [None]:
# Make predictions
model.eval()
with torch.no_grad():
    predictions = model(X_test.to(device))
    predictions = scaler.inverse_transform(predictions.cpu().numpy())

In [None]:
# Plot the true and predicted prices in the original USD scale
true_prices = scaler.inverse_transform(y_test.numpy())
plt.figure(figsize=(12, 6))
plt.plot(true_prices, label='True Prices', color='blue')
plt.plot(predictions, label='Predicted Prices', color='red')
plt.legend()
plt.title('Cryptocurrency Price Prediction with PyTorch GRU')
plt.xlabel('Time')
plt.ylabel('Price (USD)')
plt.show()
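
A plot alone can make a forecast look better than it is, so it may help to report errors in price units and compare them with a naive baseline. The sketch below uses made-up numbers rather than model output. The persistence baseline, which predicts each day's price as the previous day's price, is a common reference point and not part of the original notebook.

```python
import numpy as np

# Made-up true and predicted prices in USD (not model output).
true_prices = np.array([100.0, 102.0, 101.0, 105.0, 107.0])
predicted_prices = np.array([99.0, 101.0, 103.0, 104.0, 108.0])

# Error metrics in the original price units.
errors = predicted_prices - true_prices
rmse = np.sqrt(np.mean(errors ** 2))
mae = np.mean(np.abs(errors))

# Naive persistence baseline: predict each day's price as the previous day's price.
naive_errors = true_prices[:-1] - true_prices[1:]
naive_rmse = np.sqrt(np.mean(naive_errors ** 2))

print(f"RMSE: {rmse:.3f}, MAE: {mae:.3f}, naive RMSE: {naive_rmse:.3f}")
```

If the model's RMSE is not clearly below the naive RMSE, the model has mostly learned to repeat the last observed price.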