🚀 CryptoSights: Advanced Cryptocurrency Intelligence Platform


An intelligent NLP-powered platform for cryptocurrency news analysis, sentiment tracking, and query-based intelligence.

🎯 Features β€’ πŸš€ Quick Start β€’ πŸ“– Documentation β€’ 🐳 Docker β€’ πŸ”Œ API β€’ 🀝 Contributing


πŸ“‹ Table of Contents


🎯 Overview

CryptoSights is a production-grade cryptocurrency intelligence platform that combines web scraping, natural language processing, and machine learning to deliver actionable insights from cryptocurrency news articles. The platform processes thousands of articles across major cryptocurrencies, performs advanced sentiment analysis, and provides instant, ranked answers to user queries through an intuitive web interface and RESTful API.

Supported Cryptocurrencies

  • Bitcoin (BTC) - Digital gold and store of value
  • Ethereum (ETH) - Smart contract pioneer
  • Solana (SOL) - High-performance blockchain
  • Dogecoin (DOGE) - Community-driven memecoin
  • Hamster (HMSTR) - Emerging cryptocurrency
  • General Crypto - Cross-asset analysis

Core Capabilities

| Capability | Description |
| --- | --- |
| Intelligent Scraping | Automated article collection from cryptocurrency news sources |
| NLP Processing | Advanced text extraction, cleaning, and preprocessing |
| Sentiment Analysis | Real-time sentiment tracking with polarity scoring |
| Smart Retrieval | BM25-powered ranked answer generation |
| Web Interface | Interactive query platform with visual analytics |
| RESTful API | Programmatic access for integration |
| Validation Engine | Machine learning-based answer verification |
| Docker Support | Containerized deployment with one-command setup |

✨ Key Features

🔍 Data Acquisition Pipeline

  • Automated Web Scraping

    • Harvests 10+ articles per cryptocurrency from trusted sources
    • Intelligent URL extraction with validation
    • Built-in rate limiting and exponential backoff
    • User-agent rotation to prevent blocking
    • Structured CSV output for audit trails and reproducibility
  • Robust Error Handling

    • Automatic retry mechanisms with configurable backoff
    • Comprehensive logging system for debugging
    • Graceful degradation on network failures
    • Detailed error reports in query logs
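The retry, backoff, and user-agent rotation bullets above can be sketched in a few lines. This is a minimal illustration, not the project's scraper: `fetch_with_backoff`, `USER_AGENTS`, and the injected `fetch` callable are hypothetical names, and the delay schedule is an assumption.

```python
import random
import time
from typing import Callable, Optional, Sequence

# Hypothetical user-agent pool; real values would come from project config.
USER_AGENTS: Sequence[str] = (
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
)


def fetch_with_backoff(
    fetch: Callable[[str, dict], str],
    url: str,
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url, headers), retrying on OSError with doubling delays.

    Returns the fetched body, or None once every attempt has failed, so a
    single dead URL degrades gracefully instead of stopping the pipeline.
    """
    for attempt in range(max_retries):
        # Rotate the user agent on every attempt.
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        try:
            return fetch(url, headers)
        except OSError:
            if attempt == max_retries - 1:
                break
            # Delay grows as base_delay * 1, 2, 4, ... plus a little jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return None
```

With Requests, the `fetch` argument could be `lambda url, headers: requests.get(url, headers=headers, timeout=10).text`; Requests' exceptions derive from `OSError`, so they would trigger the retry path.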

📄 Text Processing Engine

  • Multi-Stage Content Extraction

    • HTML to PDF conversion via pdfkit with wkhtmltopdf
    • Clean text extraction using trafilatura library
    • Boilerplate removal (ads, navigation, footers)
    • Character encoding normalization (UTF-8)
  • Advanced Preprocessing

    • Intelligent lowercasing with acronym preservation (BTC, ETH, DeFi)
    • Special character handling and normalization
    • Tokenization with cryptocurrency term protection
    • Stemming and lemmatization via NLTK
    • Stop word removal with domain-specific exceptions
  • Domain-Aware Processing

    • Cryptocurrency-specific keyword preservation
    • Technical term recognition (DeFi, NFT, PoS, DAO)
    • Context-sensitive cleaning for financial terms
    • Preserves numerical values for price analysis
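As a rough illustration of the preprocessing bullets above (lowercasing with acronym preservation, stop-word removal, and keeping numbers), here is a dependency-free sketch. It deliberately skips NLTK stemming and lemmatization, and `PROTECTED_TERMS`, `STOP_WORDS`, and `preprocess` are illustrative names and lists, not the project's own.

```python
import re

# Illustrative lists; the project's real term and stop-word lists are not
# shown in this README.
PROTECTED_TERMS = {"BTC", "ETH", "DeFi", "NFT", "PoS", "DAO"}
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in", "on", "for", "as"}

# Map each protected term's lowercase form back to its canonical casing.
_PROTECTED_BY_LOWER = {term.lower(): term for term in PROTECTED_TERMS}
# A token is either a number (with optional , or . groups) or a run of letters.
_TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|[A-Za-z]+")


def preprocess(text: str) -> list[str]:
    """Tokenize, lowercase, drop stop words, keep protected terms and numbers."""
    tokens = []
    for raw in _TOKEN_RE.findall(text):
        lowered = raw.lower()
        if lowered in _PROTECTED_BY_LOWER:
            # Restore canonical casing, e.g. "defi" -> "DeFi".
            tokens.append(_PROTECTED_BY_LOWER[lowered])
        elif lowered in STOP_WORDS:
            continue
        else:
            tokens.append(lowered)
    return tokens


print(preprocess("The price of BTC rose to 64,250.50 as DeFi and NFT markets rallied"))
```

Matching on the lowercase form means an ordinary word that happens to collide with a protected term (for example "pos") is also recased; a production version would likely need context to tell them apart.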

💭 Sentiment Analysis System

  • Sentence-Level Analysis

    • TextBlob-powered sentiment scoring
    • Polarity measurement (-1.0 to +1.0 scale)
    • Subjectivity detection (0.0 to 1.0)
    • Multi-class classification (Positive, Negative, Neutral)
    • Per-document aggregate sentiment
  • Visual Analytics

    • Real-time sentiment distribution charts
    • Matplotlib-powered visualizations
    • Sentiment trends over time
    • Exportable graphics for reporting
    • JSON sentiment metrics export
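To make the polarity scale and the three-way classification concrete, the sketch below maps sentence polarities to labels and aggregates them per document. The neutral band of ±0.1 and the use of the mean as the aggregate are assumptions; the README does not state the project's thresholds. Function names are hypothetical.

```python
from statistics import mean

# Assumed cut-off: polarities within [-0.1, 0.1] count as Neutral.
NEUTRAL_BAND = 0.1
LABELS = ("Positive", "Negative", "Neutral")


def classify_polarity(polarity: float, band: float = NEUTRAL_BAND) -> str:
    """Map a polarity in [-1.0, 1.0] to Positive, Negative, or Neutral."""
    if polarity > band:
        return "Positive"
    if polarity < -band:
        return "Negative"
    return "Neutral"


def summarize_document(sentence_polarities: list[float]) -> dict:
    """Aggregate sentence-level polarities into one document-level summary."""
    labels = [classify_polarity(p) for p in sentence_polarities]
    distribution = {name: labels.count(name) for name in LABELS}
    # An empty document is treated as neutral with score 0.0.
    score = round(mean(sentence_polarities), 2) if sentence_polarities else 0.0
    return {
        "label": classify_polarity(score),
        "score": score,
        "distribution": distribution,
    }


print(summarize_document([0.5, 0.0, -0.5, 1.0]))
```

With TextBlob, the input list could be built as `[s.sentiment.polarity for s in TextBlob(text).sentences]`, and the `distribution` dictionary is the kind of data a Matplotlib bar chart or JSON export would consume.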

🎯 Query Retrieval Engine

  • BM25 Algorithm Implementation

    • Okapi BM25 ranking function (k1=1.5, b=0.75)
    • Efficient inverted index structure
    • Sub-second query response times
    • Tunable ranking parameters
    • Persistent index serialization with pickle
  • Rich Query Results

    • Ranked answer ordering by relevance score (0.0-1.0)
    • Source document attribution with metadata
    • Sentiment context for each result
    • Configurable result count (top-n)
    • Duplicate result filtering
    • Query execution time tracking
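For readers unfamiliar with Okapi BM25, the following pure-Python sketch shows what the ranking function computes with the parameters listed above (k1=1.5, b=0.75), plus a simple top-n selection with duplicate filtering. It uses the non-negative IDF form `ln(1 + (N - n + 0.5) / (n + 0.5))`; the rank-bm25 package handles IDF somewhat differently, so absolute scores need not match. `bm25_scores` and `top_n_unique` are hypothetical names.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in corpus_tokens against query_tokens."""
    n_docs = len(corpus_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = (1 - b + b * len(doc) / avg_len) if avg_len else 1.0
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def top_n_unique(corpus, scores, n=5):
    """Return up to n (text, score) pairs, best first, skipping repeated texts."""
    seen, results = set(), []
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    for idx in order:
        if scores[idx] > 0 and corpus[idx] not in seen:
            seen.add(corpus[idx])
            results.append((corpus[idx], scores[idx]))
        if len(results) == n:
            break
    return results


corpus = ["bitcoin price rises", "ethereum upgrade ships", "bitcoin price rises"]
scores = bm25_scores(["bitcoin", "price"], [doc.split() for doc in corpus])
print(top_n_unique(corpus, scores, n=2))
```

Raising `k1` lets repeated terms count for more before saturating, and raising `b` strengthens the length penalty, which is what "tunable ranking parameters" refers to.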

🌐 Web Interface

  • User-Friendly Dashboard

    • Cryptocurrency selection with autocomplete
    • Real-time query processing with loading indicators
    • Interactive result display with collapsible cards
    • Sentiment visualization integration
    • Query history and saved searches (planned)
  • Responsive Design

    • Mobile-optimized layouts (Bootstrap 5)
    • Cross-browser compatibility (Chrome, Firefox, Safari, Edge)
    • Modern CSS3/HTML5 standards
    • Dark mode support (optional)
    • Accessibility features (WCAG 2.1 AA)

🔌 RESTful API

  • JSON-Based Communication

    • Standardized request/response format
    • CORS-enabled for cross-origin access
    • Comprehensive error messages with status codes
    • Rate limiting (planned)
    • API versioning support
  • Query Logging & Analytics

    • Automatic query history tracking
    • Timestamp-based audit trail
    • Analytics-ready data format
    • Performance metrics collection
    • User engagement tracking
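A timestamped query log of the kind described above can be kept as a JSON array on disk. The sketch below is one possible shape; the field names and the `append_query_log` helper are assumptions, not the project's actual `query_logs.json` schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_query_log(log_path, coin, query, processing_time_ms):
    """Append one entry to a JSON array file and return the new entry."""
    path = Path(log_path)
    # Start a new log if the file does not exist yet.
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    entry = {
        # UTC timestamp in ISO 8601 form, e.g. 2026-01-15T10:30:00+00:00
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "coin": coin,
        "query": query,
        "processing_time_ms": processing_time_ms,
    }
    entries.append(entry)
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entry
```

Rewriting the whole file on each request is simple but not safe under concurrent writers; a JSON Lines file opened in append mode would be a reasonable alternative once traffic grows.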

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      CryptoSights Platform                      │
└─────────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┴───────────────┐
                │                               │
        ┌───────▼────────┐              ┌──────▼───────┐
        │ Data Pipeline  │              │  Web Layer   │
        └───────┬────────┘              └──────┬───────┘
                │                              │
    ┌───────────┼───────────┐          ┌───────┼───────┐
    │           │           │          │       │       │
┌───▼───┐   ┌──▼───┐   ┌──▼───┐   ┌──▼───┐ ┌───▼──┐ ┌──▼────┐
│Scraper│   │ PDF  │   │ NLP  │   │Flask │ │REST  │ │Docker │
│Module │   │Conver│   │Engine│   │ UI   │ │ API  │ │Deploy │
└───┬───┘   └──┬───┘   └──┬───┘   └──┬───┘ └─┬────┘ └──┬────┘
    │          │          │          │       │         │
    └──────────┴──────────┴──────────┴───────┴─────────┘
                        │
            ┌───────────┴───────────┐
            │                       │
        ┌───▼────┐              ┌──▼──────┐
        │  BM25  │              │Sentiment│
        │Indexing│              │Analyzer │
        └───┬────┘              └──┬──────┘
            │                      │
    ┌───────▼──────────────────────▼────────┐
    │        Query Processing Layer         │
    └───────────────────────────────────────┘
                        │
            ┌───────────┴──────────┐
            │                      │
        ┌───▼────────┐      ┌──────▼──────┐
        │ Validation │      │   Results   │
        │   Engine   │      │ Formatting  │
        └────────────┘      └─────────────┘

Pipeline Flow

  1. Data Acquisition → Web scraping → URL collection → CSV storage
  2. Content Processing → HTML to PDF → Text extraction → Clean output
  3. NLP Processing → Tokenization → Lemmatization → Term preservation
  4. Indexing → BM25 index creation → Persistence → Memory optimization
  5. Query Handling → User input → BM25 ranking → Result formatting
  6. Validation → Semantic matching → Relevance scoring → Quality assurance

🛠️ Technology Stack

Core Technologies

| Category | Technologies |
| --- | --- |
| Language | Python 3.10+ |
| Web Frameworks | Flask 2.0+ & FastAPI |
| NLP Libraries | NLTK, TextBlob, Sentence-Transformers |
| ML Algorithms | BM25Okapi, Random Forest, TF-IDF |
| Data Processing | Pandas, NumPy |
| Web Scraping | BeautifulSoup4, Requests, Trafilatura |
| Visualization | Matplotlib, Seaborn |
| Containerization | Docker, Docker Compose |
| Deployment | Hugging Face Spaces, Docker Hub |

Detailed Dependencies

# Web Framework & API
flask>=2.0.0
flask-cors>=3.0.10
fastapi>=0.95.0

# Web Scraping & Content Extraction
requests>=2.28.0
beautifulsoup4>=4.11.0
trafilatura>=1.4.0
pdfkit>=1.0.0
lxml>=4.9.0

# Natural Language Processing
nltk>=3.8.0
textblob>=0.17.0
sentence-transformers>=2.2.0

# Machine Learning & Retrieval
rank-bm25>=0.2.2
scikit-learn>=1.2.0

# Data Manipulation
pandas>=1.5.0
numpy>=1.23.0

# Visualization
matplotlib>=3.6.0
seaborn>=0.12.0

# Utilities
python-dotenv>=0.21.0

External Dependencies


🚀 Quick Start

Option 1: Traditional Python Installation

Prerequisites

  • Python 3.10 or higher
  • pip (Python package manager)
  • wkhtmltopdf
  • Git

Installation Steps

1. Clone the Repository

git clone https://github.com/techiepookie/crypto-insight-ai.git
cd crypto-insight-ai

2. Create Virtual Environment (Recommended)

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

3. Install Python Dependencies

pip install -r requirements.txt

4. Download NLTK Resources

python -c "import nltk; \
    nltk.download('punkt'); \
    nltk.download('punkt_tab'); \
    nltk.download('stopwords'); \
    nltk.download('wordnet'); \
    nltk.download('omw-1.4')"

5. Install wkhtmltopdf

# Ubuntu/Debian
sudo apt-get install wkhtmltopdf

# macOS
brew install wkhtmltopdf

# Windows: Download from https://wkhtmltopdf.org/downloads.html

6. Verify Installation

python -c "import flask, nltk, textblob, rank_bm25; print('✓ All dependencies installed!')"

7. Start the Application

cd crypto_sights
python run.py

Access at: http://localhost:5000


🐳 Docker Deployment

Option 2: Docker (Recommended for Production)

Requires Docker installed: https://docs.docker.com/get-docker/

Quick Start

# Pull pre-built image from Docker Hub
docker pull techiepookie/cryptosights:latest

# Run container
docker run -d \
  --name cryptosights \
  -p 7860:7860 \
  techiepookie/cryptosights:latest

# Access application
# http://localhost:7860

Local Build & Deploy

# Build Docker image (run from the directory that contains the Dockerfile;
# the project tree lists it under crypto_sights/)
docker build -t cryptosights .

# Run the locally built image
docker run -d \
  --name cryptosights \
  -p 7860:7860 \
  cryptosights

# View logs
docker logs -f cryptosights

# Stop container
docker stop cryptosights
docker rm cryptosights

Docker Compose (Recommended)

Create docker-compose.yml:

version: '3.8'

services:
  cryptosights:
    image: techiepookie/cryptosights:latest
    container_name: cryptosights
    ports:
      - "7860:7860"
    volumes:
      - ./app/data:/app/app/data
    environment:
      - FLASK_ENV=production
      - PYTHONUNBUFFERED=1
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
      interval: 30s
      timeout: 10s
      retries: 3

Deploy:

docker-compose up -d
docker-compose logs -f
docker-compose down
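The Compose healthcheck polls `GET /health` from inside the container. When scripting a deployment from the host, a small readiness check along the same lines can be handy. This is a sketch under assumptions: `wait_until_healthy` and `http_get_json` are hypothetical helpers, and the `"status": "healthy"` field follows the Health Check response documented in the API Reference.

```python
import json
import time
import urllib.request


def http_get_json(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL and decode its body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_healthy(
    url="http://localhost:7860/health",
    attempts=10,
    delay=3.0,
    fetch=http_get_json,
    sleep=time.sleep,
) -> bool:
    """Poll the health endpoint until it reports healthy or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: not ready yet.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    print("healthy" if wait_until_healthy() else "not healthy")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running container.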

Docker Hub Publishing

# Tag image
docker tag cryptosights:latest techiepookie/cryptosights:latest
docker tag cryptosights:latest techiepookie/cryptosights:v3.0.0

# Login to Docker Hub
docker login

# Push to registry
docker push techiepookie/cryptosights:latest
docker push techiepookie/cryptosights:v3.0.0

Docker Hub Repository: techiepookie/cryptosights

Hugging Face Spaces Deployment

Live Demo: https://techiepookie-cryptosightsai.hf.space

# Clone Space repository
git clone https://huggingface.co/spaces/techiepookie/cryptosightsai
cd cryptosightsai

# Copy application files
cp -r /path/to/crypto-insight-ai/crypto_sights/* .
cp /path/to/crypto-insight-ai/requirements.txt .

# Deploy via Git
git add .
git commit -m "Deploy CryptoSights v3.0"
git push

Hugging Face automatically builds and deploys the Docker container.


📖 Usage Guide

Running the Complete Data Pipeline

Step 1: Scrape & Process Articles

cd crypto_sights/app/scripts
python crypto_scraping_model.py

Pipeline Steps:

  1. Scrapes articles for all cryptocurrencies
  2. Converts articles to PDF format
  3. Extracts and cleans text content
  4. Performs sentiment analysis
  5. Builds BM25 search index
  6. Saves processed data and indexes

Output Files:

  • {coin}_news_urls.csv - Article URLs
  • {coin}_article_{n}.pdf - Downloaded articles
  • cleaned_data.txt - Preprocessed corpus
  • extracted_text.pickle - BM25 index
  • Sentiment charts (PNG)
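As an illustration of the first output file, the sketch below validates and de-duplicates scraped URLs before writing them as CSV. The column names (`coin`, `url`) and the helper names are assumptions; the real layout of `{coin}_news_urls.csv` is not documented here.

```python
import csv
import io
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def write_url_csv(coin: str, urls, stream) -> int:
    """Write unique, valid URLs for one coin as CSV rows; return the row count."""
    writer = csv.writer(stream)
    writer.writerow(["coin", "url"])  # hypothetical header
    seen = set()
    for url in urls:
        url = url.strip()
        if is_valid_url(url) and url not in seen:
            seen.add(url)
            writer.writerow([coin, url])
    return len(seen)


# In-memory demo; for a real file, pass
# open(f"{coin}_news_urls.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_url_csv("bitcoin", ["https://example.com/a", "not a url"], buffer)
print(buffer.getvalue())
```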

Starting the Web Application

Option 1: Direct Python

cd crypto_sights
python run.py
# Access: http://localhost:5000

Option 2: Docker

docker run -p 7860:7860 techiepookie/cryptosights:latest
# Access: http://localhost:7860

Web Interface Usage

  1. Access Dashboard → http://localhost:7860 (Docker) or http://localhost:5000 (direct Python)
  2. Select Cryptocurrency → Choose from dropdown (Bitcoin, Ethereum, etc.)
  3. Enter Query → Type your question
    • Example: "What is driving Bitcoin's recent price increase?"
  4. View Results → See ranked answers with:
    • Relevance scores
    • Sentiment analysis
    • Source documents
    • Processing time

Command-Line Query Interface

# Run from the crypto_sights/ directory so the app package is importable
from app.scripts.model import process_query

# Execute a query
results = process_query(
    coin="bitcoin",
    query="What are analysts predicting for Bitcoin in 2025?",
    top_n=5
)

# Access results
print(f"Sentiment: {results['sentiment']}")
for idx, answer in enumerate(results['top_answers'], 1):
    print(f"{idx}. {answer}")

Advanced Usage Examples

Custom Query with BM25:

import pickle
from rank_bm25 import BM25Okapi

# Load BM25 index
with open('app/data/extracted_text.pickle', 'rb') as f:
    corpus, bm25 = pickle.load(f)

# Perform custom query
query = "ethereum smart contract upgrades"
tokenized_query = query.lower().split()
scores = bm25.get_scores(tokenized_query)

# Get top results
top_indices = scores.argsort()[-10:][::-1]
results = [corpus[i] for i in top_indices]
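Raw BM25 scores are unbounded, while the feature list describes relevance scores on a 0.0-1.0 scale. The README does not say how that mapping is done; one simple option, shown here purely as an assumption, is to divide by the best score in the result set. `scale_to_unit` is a hypothetical helper.

```python
def scale_to_unit(scores):
    """Scale scores into [0.0, 1.0] by dividing by the maximum score.

    Negative scores are clamped to 0.0 first, so 0.0 always means
    "no evidence of relevance" and 1.0 marks the best match in this set.
    """
    values = [max(float(s), 0.0) for s in scores]
    top = max(values, default=0.0)
    if top == 0.0:
        return [0.0] * len(values)
    return [v / top for v in values]


print(scale_to_unit([1.0, 2.0, 4.0]))  # [0.25, 0.5, 1.0]
```

The function accepts any iterable of numbers, including the NumPy array returned by `bm25.get_scores`. Scores scaled this way are only comparable within one query, since the maximum changes from query to query.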

Batch Query Processing:

from app.scripts.model import process_query

queries = [
    "Bitcoin price prediction 2025",
    "Ethereum scaling solutions",
    "Solana network performance"
]

# Run each query against the cross-asset ("general") corpus
for query in queries:
    results = process_query("general", query, top_n=3)
    print(f"\nQuery: {query}")
    print(f"Sentiment: {results['sentiment']}")

🔌 API Reference

Base URL

http://localhost:7860
http://localhost:5000  # Development

Endpoints

Health Check

Endpoint: GET /health

curl http://localhost:7860/health

Response:

{
  "status": "healthy",
  "timestamp": "2026-01-15T10:30:00Z",
  "version": "3.0.0"
}

Query Processing

Endpoint: POST /query

Request:

curl -X POST http://localhost:7860/query \
  -H "Content-Type: application/json" \
  -d '{
    "coin": "bitcoin",
    "query": "What is the latest trend in Bitcoin price?"
  }'

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| coin | string | Yes | bitcoin, ethereum, solana, dogecoin, hamster, general |
| query | string | Yes | User question (max 500 chars) |

Response (200 OK):

{
  "query": "What is the latest trend in Bitcoin price?",
  "coin": "bitcoin",
  "sentiment": "Positive (Score: 0.65)",
  "top_answers": [
    "Bitcoin prices surged this week due to increased institutional adoption. (Score: 0.92)",
    "Analysts predict Bitcoin will reach new highs in Q2 2025. (Score: 0.85)"
  ],
  "metadata": {
    "processing_time_ms": 145,
    "total_documents_searched": 1247,
    "timestamp": "2026-01-15T10:30:00Z"
  }
}
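In the sample response, `sentiment` and each entry of `top_answers` embed their score in the text as `(Score: x.xx)`. A client that needs the number separately could split it out as sketched below; `split_score` is a hypothetical helper and assumes the suffix format shown in the example holds for all responses.

```python
import re
from typing import Optional, Tuple

# Matches a trailing "(Score: 0.92)" and captures the text before it.
_SCORE_RE = re.compile(
    r"^(?P<text>.*?)\s*\(Score:\s*(?P<score>-?\d+(?:\.\d+)?)\)\s*$",
    re.DOTALL,
)


def split_score(value: str) -> Tuple[str, Optional[float]]:
    """Split 'some text (Score: 0.92)' into ('some text', 0.92).

    Returns (value, None) unchanged when no score suffix is present.
    """
    match = _SCORE_RE.match(value)
    if match is None:
        return value, None
    return match.group("text"), float(match.group("score"))


print(split_score("Positive (Score: 0.65)"))  # ('Positive', 0.65)
```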

Response (400 Bad Request):

{
  "error": "Invalid coin parameter",
  "message": "Coin must be one of: bitcoin, ethereum, solana, dogecoin, hamster, general",
  "status": 400
}
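The parameter rules and the 400 response above suggest a validation step before any retrieval happens. A minimal sketch follows, assuming a hypothetical `validate_query_payload` helper; the coin error text mirrors the documented response, while the two query error messages are invented for illustration.

```python
ALLOWED_COINS = ("bitcoin", "ethereum", "solana", "dogecoin", "hamster", "general")
MAX_QUERY_CHARS = 500


def _error(error: str, message: str) -> dict:
    """Build an error body in the same shape as the documented 400 response."""
    return {"error": error, "message": message, "status": 400}


def validate_query_payload(payload: dict):
    """Return None when the payload is valid, otherwise a 400 error body."""
    coin = payload.get("coin")
    if coin not in ALLOWED_COINS:
        return _error(
            "Invalid coin parameter",
            "Coin must be one of: " + ", ".join(ALLOWED_COINS),
        )
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        # Hypothetical wording; not taken from the project.
        return _error("Invalid query parameter", "Query must be a non-empty string")
    if len(query) > MAX_QUERY_CHARS:
        # Hypothetical wording; not taken from the project.
        return _error(
            "Invalid query parameter",
            f"Query must be at most {MAX_QUERY_CHARS} characters",
        )
    return None


print(validate_query_payload({"coin": "ripple", "query": "Is it up?"}))
```

In a Flask route, a non-None result would typically be returned with `jsonify(error), 400`.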

Python SDK

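import requests

class CryptoSightsClient:
    def __init__(self, base_url="http://localhost:7860", timeout=30):
        self.base_url = base_url.rstrip("/")
        # Seconds to wait; without a timeout a stalled server blocks forever
        self.timeout = timeout

    def query(self, coin, question):
        """Submit a query to CryptoSights and return the decoded JSON body"""
        endpoint = f"{self.base_url}/query"
        payload = {"coin": coin, "query": question}
        response = requests.post(endpoint, json=payload, timeout=self.timeout)
        return response.json()

# Usage
client = CryptoSightsClient()
results = client.query("bitcoin", "What drives Bitcoin volatility?")
# Error bodies carry "error"/"message" instead of "sentiment"
print(results.get("sentiment", results.get("message")))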

🎓 Evaluation System

The Validation Engine ensures answer quality through ML-based relevance scoring and semantic matching.

Architecture

Input Validation
       ↓
Keyword Matching
       ↓
Semantic Similarity (BERT)
       ↓
TF-IDF Vectorization
       ↓
Random Forest Classifier
       ↓
Valid/Invalid Classification

Components

1. Entry Validation

def is_valid_entry(entry):
    """Multi-stage validation pipeline"""
    # Stage 1: Special cases
    if entry['coin'].lower() == 'general crypto':
        return True
    
    # Stage 2: Basic validation
    if not entry['query'] or not entry['answers']:
        return False
    
    # Stage 3: Keyword matching
    if has_keyword_match(entry):
        return True
    
    # Stage 4: Semantic similarity (threshold: 0.5)
    similarity = compute_semantic_similarity(entry)
    return similarity >= 0.5
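`has_keyword_match` is called above but not shown. One plausible shape is sketched here as an assumption: the keyword table is invented for illustration (the project tree lists a `keywords.py` whose contents are not reproduced in this README), and matching against both the query and the answers is a guess.

```python
import re

# Hypothetical keyword table, for illustration only.
COIN_KEYWORDS = {
    "bitcoin": {"bitcoin", "btc"},
    "ethereum": {"ethereum", "eth"},
    "solana": {"solana", "sol"},
    "dogecoin": {"dogecoin", "doge"},
    "hamster": {"hamster", "hmstr"},
}


def has_keyword_match(entry: dict) -> bool:
    """True if any keyword for the entry's coin appears in its query or answers."""
    coin = entry["coin"].lower()
    # Unknown coins fall back to matching on the coin name itself.
    keywords = COIN_KEYWORDS.get(coin, {coin})
    text = " ".join([entry["query"], *entry["answers"]]).lower()
    # Compare whole words so "eth" does not match inside "method".
    words = set(re.findall(r"[a-z0-9]+", text))
    return bool(keywords & words)
```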

2. Semantic Similarity

  • Model: all-MiniLM-L6-v2 (384-dimensional embeddings)
  • Similarity Metric: Cosine similarity
  • Threshold: 0.5
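The similarity check itself is small once embeddings exist. The sketch below computes cosine similarity on plain lists and applies the 0.5 threshold; accepting an entry when any answer clears the threshold is an assumption, as is the `passes_threshold` name.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def passes_threshold(query_vec, answer_vecs, threshold: float = 0.5) -> bool:
    """True if at least one answer embedding is close enough to the query."""
    return any(cosine_similarity(query_vec, vec) >= threshold for vec in answer_vecs)


print(passes_threshold([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]]))  # True
```

With Sentence-Transformers, the vectors would come from `SentenceTransformer("all-MiniLM-L6-v2").encode(texts)`, which yields one 384-dimensional embedding per input text.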

3. Machine Learning

  • Algorithm: Random Forest Classifier
  • Features: TF-IDF vectors (500 features)
  • Cross-Validation: 5-fold CV
  • Performance: F1-score optimization

Usage Example

from evaluation.evalmodel import is_valid_entry

entry = {
    'coin': 'bitcoin',
    'query': 'What factors influence Bitcoin price?',
    'answers': [
        'Supply and demand affect Bitcoin price.',
        'Regulatory news impacts sentiment.'
    ]
}

is_valid = is_valid_entry(entry)
print(f"Valid: {is_valid}")  # True

📁 Project Structure

crypto-insight-ai/
├── cache/                              # Temporary cache
├── crypto_sights/                      # Main package
│   ├── app/                            # Flask application
│   │   ├── data/                       # Data storage
│   │   │   ├── UI-UX/
│   │   │   ├── extracted_text.pickle   # BM25 index
│   │   │   ├── metrics.json
│   │   │   └── query_logs.json
│   │   ├── routes/                     # API endpoints
│   │   │   ├── __init__.py
│   │   │   └── main_routes.py
│   │   ├── scripts/                    # Processing
│   │   │   ├── crypto_scraping_model.py
│   │   │   ├── crypto.py
│   │   │   ├── keywords.py
│   │   │   ├── logger.py
│   │   │   ├── model.py
│   │   │   └── process_and_push.py
│   │   ├── static/                     # Web assets
│   │   │   ├── assets/
│   │   │   ├── css/
│   │   │   └── js/
│   │   ├── templates/                  # HTML templates
│   │   │   ├── index.html
│   │   │   ├── evaluation.html
│   │   │   ├── faq.html
│   │   │   └── ...
│   │   └── __init__.py
│   ├── evaluation/                     # Validation engine
│   │   ├── evalbackend.py
│   │   └── evalmodel.py
│   ├── Dockerfile                      # Container config
│   ├── deployment.md                   # Deployment guide
│   └── run.py                          # Entry point
├── .dockerignore
├── .gitignore
├── contributors.md
├── README.md
├── requirements.txt
└── LICENSE

💻 Development

Setup Development Environment

# Clone repository
git clone https://github.com/techiepookie/crypto-insight-ai.git
cd crypto-insight-ai

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install development dependencies (optional; mypy is needed for the type-checking step)
pip install pytest pytest-cov black flake8 pylint mypy

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=crypto_sights

# Run specific test
pytest tests/test_model.py

Code Quality

# Format code
black crypto_sights/

# Lint
flake8 crypto_sights/ --max-line-length=100

# Type checking
mypy crypto_sights/

Commit Message Convention

Follow Conventional Commits:

feat: add new feature description
fix: resolve bug issue
docs: update documentation
refactor: improve code structure
test: add test cases
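A commit header can be checked against this convention mechanically, for example from a `commit-msg` hook. The sketch below accepts only the five types listed above, plus the optional `(scope)` and `!` markers that Conventional Commits allows; `check_commit_header` is a hypothetical helper, not part of the repository.

```python
import re

# Types listed in this README; Conventional Commits itself permits others.
COMMIT_TYPES = ("feat", "fix", "docs", "refactor", "test")

# type, optional (scope), optional "!", then ": " and a non-empty subject.
_HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<subject>\S.*)$"
)


def check_commit_header(header: str) -> bool:
    """True if the first line of a commit message follows the convention."""
    match = _HEADER_RE.match(header)
    return match is not None and match.group("type") in COMMIT_TYPES


for line in ("feat: add sentiment export", "Added some stuff"):
    print(line, "->", check_commit_header(line))
```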

🀝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

How to Contribute

  1. Fork the repository
  2. Create feature branch - git checkout -b feature/your-feature
  3. Commit changes - git commit -m "feat: description"
  4. Push to fork - git push origin feature/your-feature
  5. Create Pull Request

Code Style

  • Follow PEP 8
  • Add docstrings to functions
  • Include type hints
  • Write unit tests
  • Maintain >80% code coverage

🔧 Troubleshooting

Common Issues

Q: "ModuleNotFoundError: No module named 'flask'"

pip install -r requirements.txt

Q: "wkhtmltopdf not found"

# Ubuntu
sudo apt-get install wkhtmltopdf

# macOS
brew install wkhtmltopdf

Q: "NLTK resources not downloaded"

python -c "import nltk; [nltk.download(r) for r in ('punkt', 'punkt_tab', 'stopwords', 'wordnet', 'omw-1.4')]"

Q: "Port 7860 already in use"

# Find process using port
lsof -i :7860

# Kill process or use different port
docker run -p 8080:7860 techiepookie/cryptosights:latest

Q: "Out of memory errors"

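# Increase Docker's memory allocation in Docker Desktop settings.
# Mounting the data directory as a volume (same container path as in the
# Compose file) keeps processed data on the host between runs.
docker run -v /path/to/data:/app/app/data techiepookie/cryptosights:latest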

Getting Help


📄 License

This project is licensed under the MIT License - see LICENSE for details.

Copyright (c) 2025 CryptoSights Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files...

📞 Contact & Support

Connect

Support the Project

⭐ Star the repository | 🐛 Report bugs | 💡 Suggest features | 🤝 Contribute code


🙏 Acknowledgments

Team

  • Lead Developers: Harshal Chaudhari & Nikhil Kumar Obhawani
  • Contributors: See contributors.md

Technologies

Flask • NLTK • TextBlob • Scikit-learn • Pandas • Docker • Hugging Face


Built with ❤️ by the CryptoSights Team
