# Detecting Trend Breaks with Semantic Signals

## Installation
Install the project dependencies listed in the provided `requirements.txt`:

In [1]:
pip install -r ./requirements.txt

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.
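
Before running the later cells, it can help to confirm that the tools they rely on are available. The following is a minimal sketch and not part of the project itself: the helper name `missing_modules` is hypothetical, and the only prerequisites it assumes are the ones the download cells visibly use (the `kagglehub` package and the `git` command).

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# `kagglehub` is imported by the Dataset I cell; `git` is invoked by the Dataset II cell.
print("Missing Python packages:", missing_modules(["kagglehub"]))
print("git found on PATH:", shutil.which("git") is not None)
```

An empty list and `True` suggest both download cells have what they need; anything else points to a package to install or a kernel to restart.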


## Datasets
The datasets required for this project are downloaded into the `./data/raw` directory, with one subfolder per dataset.
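
To see which datasets are already in place before downloading anything, a small check along these lines may be useful. It is only a sketch: the function name `dataset_status` is hypothetical, and the folder names are taken from the download cells in this notebook (`Impact_on_Stock_Returns` and `StockEmotions` under `./data/raw`).

```python
from pathlib import Path

# Folder names taken from the download cells in this notebook.
DATASET_DIRS = ("Impact_on_Stock_Returns", "StockEmotions")


def dataset_status(raw_dir="./data/raw"):
    """Map each expected dataset folder to True if it exists and is non-empty."""
    raw = Path(raw_dir)
    status = {}
    for name in DATASET_DIRS:
        folder = raw / name
        status[name] = folder.is_dir() and any(folder.iterdir())
    return status


print(dataset_status())
```

A dataset reported as `True` does not need to be downloaded again.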

### Download Dataset I
Run the code below to download the *Tweet Sentiment's Impact on Stock Returns* dataset:

In [None]:
import os
import shutil
import subprocess
import kagglehub

# Ensure the ./data/raw directory exists
destination = "./data/raw"
os.makedirs(destination, exist_ok=True)

# Download the Tweet Sentiment's Impact on Stock Returns dataset into the kagglehub cache
path = kagglehub.dataset_download("thedevastator/tweet-sentiment-s-impact-on-stock-returns")

# Final location of the dataset inside ./data/raw
final_path = os.path.join(destination, "Impact_on_Stock_Returns")

if path and os.path.exists(path):
    if os.path.exists(final_path):
        # Avoid nesting a second copy inside an existing folder
        print(f"Dataset already present at {final_path}; skipping move.")
    else:
        # Move the downloaded folder and rename it in a single step
        shutil.move(path, final_path)
        print("Path to dataset files:", final_path)
        print(f"Dataset moved and renamed successfully to {final_path}.")
else:
    print("Dataset path not found or download failed.")

Path to dataset files: ./data\Impact_on_Stock_Returns
Dataset moved and renamed successfully to ./data\Impact_on_Stock_Returns.
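
Once the download cell has run, the contents of the dataset folder can be inspected before any loading code is written. The sketch below assumes the folder sits at the path the cell above builds (`./data/raw/Impact_on_Stock_Returns`); the helper name `list_dataset_files` is hypothetical, and no particular file names inside the dataset are assumed.

```python
from pathlib import Path


def list_dataset_files(root):
    """Return (relative_path, size_in_bytes) for every file under root, sorted by path."""
    root = Path(root)
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


dataset_dir = Path("./data/raw/Impact_on_Stock_Returns")
if dataset_dir.is_dir():
    for rel_path, size in list_dataset_files(dataset_dir):
        print(f"{size:>12,d}  {rel_path}")
else:
    print(f"{dataset_dir} not found; run the download cell first.")
```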


### Download Dataset II
Run the code below to download the *StockEmotions* dataset:

In [None]:
# Ensure the ./data/raw directory exists (os, shutil and subprocess are imported in the Dataset I cell)
os.makedirs("./data/raw", exist_ok=True)

# Clone the repository into a temporary location
temp_repo_path = "./temp_StockEmotions"
repo_url = "https://github.com/adlnlp/StockEmotions.git"

subprocess.run(["git", "clone", repo_url, temp_repo_path], check=True)

# Move the cloned repository to the ./data directory without .git
destination_path = "./data/raw/StockEmotions"
if os.path.exists(temp_repo_path):
    shutil.move(temp_repo_path, destination_path)

    # Remove the .git folder; git marks some of its files read-only, which can make rmtree fail on Windows
    git_dir = os.path.join(destination_path, ".git")
    if os.path.exists(git_dir):
        def on_error(func, path, exc_info):
            # Make the path writable, then retry the operation that failed
            os.chmod(path, 0o777)
            func(path)
        shutil.rmtree(git_dir, onerror=on_error)

    print(f"Repository moved to {destination_path} and .git removed successfully.")
else:
    print("Failed to clone the repository.")

# Clean up the temporary path if it still exists
if os.path.exists(temp_repo_path):
    shutil.rmtree(temp_repo_path)

Repository moved to ./data/StockEmotions and .git removed successfully.
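
The cell above passes `onerror` to `shutil.rmtree`. That parameter is deprecated as of Python 3.12 in favor of `onexc`, so on newer interpreters the call emits a deprecation warning. A version-aware variant could look like the sketch below; the function name `force_rmtree` is hypothetical and this is one possible approach, not the project's own code.

```python
import os
import shutil
import stat
import sys


def _make_writable_and_retry(func, path, _exc):
    """Give the owner full permissions on `path`, then retry the failed call."""
    os.chmod(path, stat.S_IRWXU)
    func(path)


def force_rmtree(path):
    """Delete a directory tree, retrying entries that fail because they are read-only."""
    if sys.version_info >= (3, 12):
        # Python 3.12+ passes the exception instance to the handler via `onexc`.
        shutil.rmtree(path, onexc=_make_writable_and_retry)
    else:
        # Older versions pass a sys.exc_info() tuple via `onerror`.
        shutil.rmtree(path, onerror=_make_writable_and_retry)
```

With this helper, the `.git` removal step would reduce to `force_rmtree(git_dir)`.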
