Refactoring for Ollama Integration
=======================================================================

The goal of this exercise is to refactor the implementations from the prompting exercises to integrate Ollama while retaining the existing OpenAI API calls. Doing so lets us compare responses from OpenAI with those from a local LLM running on Ollama, evaluate the feasibility of using local models, and gain hands-on experience working with them.


## Step 0: Setup and Dependencies
--------------------------------
First, let's ensure we have all required packages installed.

In [2]:
!pip install numpy pandas matplotlib langchain openai python-dotenv typing-extensions pydantic pydantic_settings langchain-community langchain-openai --quiet

In [None]:
# Install Ollama
!pip install ollama

## Step 1: Initial Configuration
--------------------------------
Set up our environment and imports.

In [38]:
from typing import Any, Dict, List
from langchain_openai import ChatOpenAI
import ollama
import requests

#### The following class sends prompts to our Ollama LLM running on localhost:

In [39]:
import json


class OllamaClient:
    """
    A client that sends prompts to the Ollama HTTP API and returns the generated text.

    Attributes:
        model (str): The model to use for generating responses. Default is "llama3.2".
        stream (bool): Whether to request a streamed response. Default is False.

    Methods:
        __call__(prompt):
            Sends a prompt to the Ollama API and returns the complete response.
            If streaming is enabled, concatenates the streamed chunks.

            Args:
                prompt (str): The input prompt to send to the Ollama API.

            Returns:
                str: The complete response from the Ollama API.
    """
    def __init__(self, model="llama3.2", stream=False):
        self.model = model
        self.stream = stream  # Allows choosing whether to use streaming or not

    def __call__(self, prompt):
        """Sends a prompt to Ollama and returns the complete response"""
        response = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": self.model, "prompt": prompt, "stream": self.stream},
            stream=self.stream,  # Let requests read the body incrementally when streaming
        )
        response.raise_for_status()  # Fail early on HTTP errors

        if not self.stream:  # If streaming is disabled, return the response as is
            return response.json().get("response", "")

        # If streaming is enabled, each non-empty line is a JSON object; concatenate them
        full_response = ""
        for line in response.iter_lines():
            if line:
                try:
                    data = json.loads(line)
                    full_response += data.get("response", "")
                except json.JSONDecodeError as e:
                    print("\nError processing a line:", e)

        return full_response
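
When `"stream"` is true, Ollama's `/api/generate` endpoint replies with one JSON object per line, each carrying a fragment of text in `"response"` and a `"done"` flag on the last one. The sketch below isolates the concatenation step so it can be tried without a running server. The function name `join_stream_chunks` and the sample chunks are hypothetical and only illustrate the shape of the data.

```python
import json
from typing import Iterable


def join_stream_chunks(lines: Iterable[bytes]) -> str:
    """Concatenate the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for raw in lines:
        if not raw:  # skip empty lines, as iter_lines() may yield them
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # the final chunk is flagged with "done": true
            break
    return "".join(parts)


# Hypothetical chunks shaped like a streamed /api/generate reply
sample = [
    b'{"response": "Hel", "done": false}',
    b"",
    b'{"response": "lo", "done": false}',
    b'{"response": "", "done": true}',
]
print(join_stream_chunks(sample))
```

With a live server, the same function could be fed `response.iter_lines()` from a request made with `stream=True`.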


## Step 1.5: Configuration Management
--------------------------------
Set up configuration management for Ollama. Here we configure our LLM in the same way we did with OpenAI in the prompting exercises.

In [40]:
import os
from typing import Optional
from pydantic_settings import BaseSettings
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

class Settings(BaseSettings):
    """
    Settings class to manage configuration variables for the application.

    This class loads environment variables from a .env file and provides
    access to these variables as class attributes.

    Attributes:
        ollama_model_name (str): The name of the Ollama model, loaded from the
                                 environment variable 'OLLAMA_MODEL_NAME'.
    """

    # Load variables from .env
    load_dotenv()

    # Fall back to the OllamaClient default when the variable is not set
    ollama_model_name: str = os.getenv('OLLAMA_MODEL_NAME', 'llama3.2')

In [41]:
def setup_environment() -> OllamaClient:
    """Initialize environment and create LLM instance.

    This function:
    1. Loads settings
    2. Initializes the Ollama client with the configured model

    Returns:
        OllamaClient: Configured language model client
    """
    # Load settings
    settings = Settings()

    # Initialize OllamaClient with settings
    llm = OllamaClient(
        model=settings.ollama_model_name
    )

    return llm

In [42]:
# Initialize LLM
try:
    llm = setup_environment()
except Exception as e:
    print(f"Error initializing LLM: {e}")

In [46]:
import time

# Test LLM with a prompt
start_time = time.time()
response = llm("What LLM are you?")
end_time = time.time()

print("Response:", response)
print("Execution Time:", end_time - start_time, "seconds")


Response: I am a text-based conversational AI model, specifically a large language model (LLM). My architecture is based on transformer models, which use self-attention mechanisms to process and generate human-like text.

My exact model is a variant of the BART (Bidirectional Architecture for Task-oriented Response generation) model, which is a popular LLM designed for natural language processing tasks such as text generation, question answering, and conversation.

I was trained on a massive dataset of text from various sources, including books, articles, and online content. This training allows me to generate human-like responses to a wide range of questions and topics.

However, I'd like to clarify that I'm not a single, specific model, but rather a general term for a type of LLM designed for conversational AI tasks. There are many other LLMs out there, each with their own strengths and weaknesses, and I'm constantly learning and improving my abilities based on the interactions I hav
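
Since the aim is to compare a local model with OpenAI, the timing pattern above can be wrapped in a small helper that sends the same prompt to several backends. This is a minimal sketch: `compare_backends` is a hypothetical name, and the two backends are stand-in lambdas so the code runs without a server or an API key.

```python
import time
from typing import Callable, Dict


def compare_backends(prompt: str, backends: Dict[str, Callable[[str], str]]) -> Dict[str, dict]:
    """Run one prompt through each backend and record the reply and wall-clock time."""
    results = {}
    for name, ask in backends.items():
        start = time.perf_counter()
        reply = ask(prompt)
        results[name] = {"response": reply, "seconds": time.perf_counter() - start}
    return results


# Stand-in backends (assumptions for illustration only).
# In the notebook, "ollama" could map to the `llm` instance and
# "openai" to a small wrapper that returns the chat model's text.
fake_backends = {
    "ollama": lambda p: f"[local] {p}",
    "openai": lambda p: f"[hosted] {p}",
}

for name, result in compare_backends("What LLM are you?", fake_backends).items():
    print(f"{name}: {result['seconds']:.4f}s -> {result['response']}")
```

Any callable that takes a prompt string and returns a string fits this interface, which is the same shape `OllamaClient.__call__` already has.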

## Problem: Market Sentiment Analysis System
--------------------------------
Design and implement a comprehensive market sentiment analysis system
that combines multiple data sources and ensures consistency.

### Requirements:
1. Multi-Source Analysis:
   - News articles and headlines
   - Social media sentiment
   - Technical indicators
   - Market statistics
   - Analyst reports

2. Self-Consistency Checks:
   - Cross-validation of sources
   - Internal consistency metrics
   - Temporal consistency
   - Source reliability scoring

3. Confidence Scoring:
   - Source-specific confidence
   - Analysis reliability metrics
   - Consensus confidence
   - Time-sensitivity factors
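
One way to read the confidence requirements is as a per-source reliability scaled by a time-sensitivity factor. The sketch below assumes geometric decay per day of data age. The function name, the 0.8 reliability, and the 0.95 daily factor are illustrative assumptions, not values prescribed by the requirements.

```python
def source_confidence(reliability: float, age_days: float, decay_per_day: float = 0.95) -> float:
    """Scale a source's reliability down geometrically with the age of its data."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return reliability * decay_per_day ** age_days


# Fresh data keeps the full reliability; older data is trusted less
for age in (0, 1, 7):
    print(age, round(source_confidence(0.8, age), 4))
```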

In [47]:
#Ollama
from typing import Dict, List, Any
from collections import defaultdict
import numpy as np

class MarketSentimentAnalyzer:
    """A system for comprehensive market sentiment analysis."""

    def __init__(self, llm: Any):
        """Initialize sentiment analysis system."""
        self.llm = llm
        self.source_weights = {
            "news_articles": 0.3,
            "social_media_sentiment": 0.25,
            "technical_indicators": 0.2,
            "market_statistics": 0.15,
            "analyst_reports": 0.1,
        }
        self.source_reliability = defaultdict(lambda: 0.8)  # Default reliability score
        self.temporal_decay_factor = 0.95  # Decay factor for older data

    def analyze_sentiment(self, market_data: Dict[str, str]) -> Dict[str, Any]:
        """Analyze market sentiment from multiple sources."""
        analyses = {}
        for source, content in market_data.items():
            if source == "news_articles":
                analyses[source] = self._analyze_news(content)
            elif source == "social_media_sentiment":
                analyses[source] = self._analyze_social_media(content)
            elif source == "technical_indicators":
                analyses[source] = self._analyze_technical_indicators(content)
            elif source == "market_statistics":
                analyses[source] = self._analyze_market_statistics(content)
            elif source == "analyst_reports":
                analyses[source] = self._analyze_analyst_reports(content)
            else:
                raise ValueError(f"Unknown data source: {source}")

        # Assign confidence scores
        for source, analysis in analyses.items():
            analysis["confidence"] = self._calculate_source_confidence(source, analysis)

        return analyses

    def check_consistency(self, analyses: Dict[str, Dict[str, Any]]) -> float:
        """Check consistency between different analyses."""
        sentiment_scores = []
        for analysis in analyses.values():
            sentiment_scores.append(analysis["sentiment_score"])

        # Calculate consistency as the inverse of standard deviation
        consistency = 1 / (np.std(sentiment_scores) + 1e-9)  # Avoid division by zero
        return min(consistency, 1.0)  # Cap consistency at 1.0

    def generate_consensus(self, analyses: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
        """Generate weighted consensus with confidence scores."""
        weighted_sentiment = 0
        total_weight = 0
        for source, analysis in analyses.items():
            weight = self.source_weights[source] * analysis["confidence"]
            weighted_sentiment += analysis["sentiment_score"] * weight
            total_weight += weight

        consensus_sentiment = weighted_sentiment / total_weight
        consistency_score = self.check_consistency(analyses)

        return {
            "consensus_sentiment": consensus_sentiment,
            "consistency_score": consistency_score,
            "confidence": total_weight / sum(self.source_weights.values()),
        }

    def _analyze_news(self, content: str) -> Dict[str, Any]:
        """Analyze sentiment from news articles."""
        # Use LLM to extract sentiment
        response = self.llm(f"Analyze the sentiment of the following news articles:\n{content}")
        sentiment_score = self._extract_sentiment_score(response)  # Convert the LLM's reply into a numeric score
        return {"sentiment_score": sentiment_score, "source": "news_articles"}

    def _analyze_social_media(self, content: str) -> Dict[str, Any]:
        """Analyze sentiment from social media."""
        # Initialize default values
        positive_mentions = 0.0
        negative_mentions = 0.0

        # Extract positive mentions
        if "positive mentions" in content:
            try:
                positive_mentions = float(content.split("positive mentions")[1].split("%")[0]) / 100
            except (IndexError, ValueError):
                pass  # Use default value if parsing fails

        # Extract negative mentions
        if "negative mentions" in content:
            try:
                negative_mentions = float(content.split("negative mentions")[1].split("%")[0]) / 100
            except (IndexError, ValueError):
                pass  # Use default value if parsing fails

        # Calculate sentiment score
        sentiment_score = positive_mentions - negative_mentions
        return {"sentiment_score": sentiment_score, "source": "social_media_sentiment"}

    def _analyze_technical_indicators(self, content: str) -> Dict[str, Any]:
        """Analyze sentiment from technical indicators."""
        # Extract RSI and VIX
        rsi = 50.0  # Default value
        vix = 20.0  # Default value

        if "RSI:" in content:
            try:
                rsi = float(content.split("RSI:")[1].split("\n")[0])
            except (IndexError, ValueError):
                pass  # Use default value if parsing fails

        if "VIX:" in content:
            try:
                vix = float(content.split("VIX:")[1].split("\n")[0])
            except (IndexError, ValueError):
                pass  # Use default value if parsing fails

        # Calculate sentiment score
        sentiment_score = (rsi - 50) / 50 - (vix - 20) / 20  # Normalize and combine
        return {"sentiment_score": sentiment_score, "source": "technical_indicators"}

    def _analyze_market_statistics(self, content: str) -> Dict[str, Any]:
        """Analyze sentiment from market statistics."""
        # Extract NASDAQ performance
        nasdaq_change = 0.0  # Default value

        if "NASDAQ:" in content:
            try:
                nasdaq_change = float(content.split("NASDAQ:")[1].split("%")[0]) / 100
            except (IndexError, ValueError):
                pass  # Use default value if parsing fails

        # Use NASDAQ as a proxy for tech sentiment
        sentiment_score = nasdaq_change
        return {"sentiment_score": sentiment_score, "source": "market_statistics"}

    def _analyze_analyst_reports(self, content: str) -> Dict[str, Any]:
        """Analyze sentiment from analyst reports."""
        # Use LLM to extract sentiment
        response = self.llm(f"Analyze the sentiment of the following analyst reports:\n{content}")
        sentiment_score = self._extract_sentiment_score(response)  # Convert the LLM's reply into a numeric score
        return {"sentiment_score": sentiment_score, "source": "analyst_reports"}

    def _calculate_source_confidence(self, source: str, analysis: Dict[str, Any]) -> float:
        """Calculate confidence score for a source."""
        base_confidence = self.source_reliability[source]
        temporal_decay = self.temporal_decay_factor  # Adjust based on data freshness
        return base_confidence * temporal_decay

    def _extract_sentiment_score(self, text: str) -> float:
        """
        Extract a sentiment score from LLM response.

        This function sends a text to an LLM to analyze its sentiment and returns a score between -1 and 1.

        Parameters:
        - text (str): The text to analyze for sentiment.

        Returns:
        - float: A sentiment score between -1 and 1.

        Raises:
        - ValueError: If the input text is not a string or if the LLM response is not a valid float between -1 and 1.
        """
        if not isinstance(text, str):
            raise ValueError("Input text must be a string.")

        try:
            # Use the instance's LLM rather than the notebook-level `llm` variable
            response = self.llm(
                f"Analyze the sentiment of the following text:\n{text}\n"
                "Please provide only a score between -1 and 1.\n"
                "- The score must be a float.\n"
                "- The answer must be only a float number between -1 and 1.\n"
                "- Format the answer exactly like this example: Score=0.5"
            )

            # Parse the response to extract the sentiment score
            score = float(response.split("=")[1])

            if not (-1 <= score <= 1):
                raise ValueError("Sentiment score must be between -1 and 1.")

            return score

        except Exception as e:
            raise ValueError(f"Error in extracting sentiment score: {e}") from e
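
Local models do not always follow the `Score=0.5` format exactly. They may add a sentence before the score or a period after it, and splitting on `=` then fails. A more tolerant parser could look like the sketch below. The name `parse_score` and the accepted variations (`=` or `:`, optional spaces) are assumptions about what a model might return, not behaviour observed in this notebook.

```python
import re

# Matches "Score=0.5", "score: -0.25", "Score = .7" and similar
_SCORE_PATTERN = re.compile(r"score\s*[=:]\s*([-+]?\d*\.?\d+)", re.IGNORECASE)


def parse_score(reply: str) -> float:
    """Pull the first 'Score=<number>' out of a reply and validate its range."""
    match = _SCORE_PATTERN.search(reply)
    if match is None:
        raise ValueError(f"No score found in reply: {reply!r}")
    score = float(match.group(1))
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"Score out of range [-1, 1]: {score}")
    return score


for reply in ("Score=0.5", "Sure! Score = -0.25.", "The overall score: 1"):
    print(repr(reply), "->", parse_score(reply))
```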


### Example Test Data:

In [48]:
market_data = {
    "news_articles": """
HEADLINE: Tech Stocks Rally on Strong Earnings
(Reuters) - Technology stocks surged today following better-than-expected
earnings from major players. Apple Inc. and Microsoft Corp. both beat analyst
estimates, driving broader market gains. Cloud computing and AI segments
showed particular strength.

HEADLINE: Fed Signals Potential Rate Cuts
The Federal Reserve indicated openness to rate cuts later this year, citing
moderating inflation pressures. Markets responded positively to the news,
with bond yields declining.

HEADLINE: Startup Funding Shows Signs of Recovery
Venture capital investments increased 15% in Q4, marking the first
quarterly rise since 2022. Software and fintech sectors led the recovery.
""",

    "social_media_sentiment": """
$AAPL trending positive:
- 65% positive mentions
- 28% neutral mentions
- 7% negative mentions
Volume: 50,000 mentions

$MSFT sentiment metrics:
- 72% positive mentions
- 22% neutral mentions
- 6% negative mentions
Volume: 35,000 mentions

#TechStocks trending topics:
1. #EarningsSeason
2. #TechRally
3. #InvestInTech
""",

    "technical_indicators": """
Market Technical Analysis:
- S&P 500 RSI: 62.5
- NASDAQ RSI: 65.8
- VIX: 16.5
- Moving Averages: Most above 200-day
- Volume: +25% vs 30-day average
- Advance/Decline: 2.5:1
""",

    "market_statistics": """
Market Overview:
- S&P 500: +1.2%
- NASDAQ: +1.8%
- DOW: +0.9%
- Small Caps: +1.5%
- Sector Leaders: Tech +2.3%, Communications +1.9%
- Sector Laggards: Utilities -0.4%, Real Estate -0.2%
""",

    "analyst_reports": """
Goldman Sachs: Overweight Tech Sector
- Target price revisions: +10% average
- Sector outlook: Positive
- Key drivers: AI adoption, cloud growth
- Risk factors: Valuations, rate sensitivity

Morgan Stanley: Market Outlook
- Stance: Constructively bullish
- Focus areas: Quality growth stocks
- Concerns: Geopolitical tensions
- 12-month S&P target: 5200
"""
}
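
In this test data the percentage comes before its label (`65% positive mentions`), whereas `_analyze_social_media` reads the text that follows the label. On this input both parses appear to raise `ValueError` and fall back to 0.0, so the social media score ends up neutral. A regular expression that captures the number in front of the label is one alternative. This is a sketch: `mention_shares` is a hypothetical helper, and averaging across tickers is an assumption about how the shares should be combined.

```python
import re
from statistics import mean


def mention_shares(text: str, label: str) -> list:
    """Return every "<number>% <label> mentions" share in text as a fraction."""
    pattern = rf"(\d+(?:\.\d+)?)%\s+{re.escape(label)}\s+mentions"
    return [float(value) / 100 for value in re.findall(pattern, text)]


sample = """
$AAPL trending positive:
- 65% positive mentions
- 28% neutral mentions
- 7% negative mentions

$MSFT sentiment metrics:
- 72% positive mentions
- 22% neutral mentions
- 6% negative mentions
"""

positive = mention_shares(sample, "positive")
negative = mention_shares(sample, "negative")

# Average each share across tickers; treat a missing label as 0.0
score = (mean(positive) if positive else 0.0) - (mean(negative) if negative else 0.0)
print(positive, negative, round(score, 3))
```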

In [49]:
# Initialize the analyzer with an LLM instance
analyzer = MarketSentimentAnalyzer(llm)

analyzer_start_time = time.time()
analyses = analyzer.analyze_sentiment(market_data)
analyzer_end_time = time.time()

# Check consistency
consistency_start_time = time.time()
consistency_score = analyzer.check_consistency(analyses)
consistency_end_time = time.time()

# Generate consensus
consensus_start_time = time.time()
consensus = analyzer.generate_consensus(analyses)
consensus_end_time = time.time()

print("Analyzer Check Time:", analyzer_end_time - analyzer_start_time, "seconds")
print("Consensus Sentiment:", consensus["consensus_sentiment"])
print("Consensus Check Time:", consensus_end_time - consensus_start_time, "seconds")
print("Consistency Score:", consensus["consistency_score"])
print("Consistency Check Time:", consistency_end_time - consistency_start_time, "seconds")
print("Confidence:", consensus["confidence"])


Analyzer Check Time: 249.7772490978241 seconds
Consensus Sentiment: 0.38769999999999993
Consensus Check Time: 0.000308990478515625 seconds
Consistency Score: 1.0
Consistency Check Time: 0.03458714485168457 seconds
Confidence: 0.76
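
A consistency score of exactly 1.0 is worth a second look. `check_consistency` computes `min(1 / (std + 1e-9), 1.0)`, and as long as the sentiment scores stay within [-1, 1] their population standard deviation cannot exceed 1, so the result is capped at 1.0 whether the sources agree or not. The sketch below reproduces that formula with the standard library (`statistics.pstdev` matches `np.std` with its default settings) and contrasts it with a linear alternative. The alternative and both score lists are illustrative assumptions, not part of the notebook.

```python
from statistics import pstdev


def capped_inverse_std(scores):
    """The formula used in check_consistency: min(1 / (std + 1e-9), 1.0)."""
    return min(1 / (pstdev(scores) + 1e-9), 1.0)


def linear_consistency(scores):
    """Alternative (an assumption, not the notebook's method): 1 - std, floored at 0."""
    return max(0.0, 1.0 - pstdev(scores))


agree = [0.40, 0.42, 0.38, 0.41]       # illustrative scores that agree
disagree = [0.90, -0.80, 0.10, -0.30]  # illustrative scores that conflict

for label, scores in (("agree", agree), ("disagree", disagree)):
    print(label, capped_inverse_std(scores), round(linear_consistency(scores), 3))
```

Both lists get 1.0 from the capped inverse, while the linear variant separates them.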
