# ML Models for Sustainable Investment Portfolio

This notebook demonstrates the machine learning models used in the Sustainable Investment Portfolio application to provide AI-powered recommendations and analysis.

## Setup

First, let's install the required packages:

In [None]:
!pip install scikit-learn pandas numpy matplotlib seaborn joblib wordcloud
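
Package names on PyPI do not always match their import names (for example, `scikit-learn` is imported as `sklearn`). One quick way to confirm the installation worked is to look each module up without importing it. The helper name `missing_modules` is hypothetical; this is a minimal sketch, not part of the application:

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# Note that the scikit-learn package is imported as "sklearn".
REQUIRED_MODULES = [
    "sklearn", "pandas", "numpy", "matplotlib", "seaborn", "joblib", "wordcloud",
]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```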

## Generate Dataset

Let's generate the two datasets used by our ML models, a portfolio dataset and a market news dataset:

In [None]:
# Upload the dataset generator script (files.upload() is only available in Google Colab)
from google.colab import files
uploaded = files.upload()  # Upload large_portfolio_dataset.py

# Run the script to generate the dataset
!python large_portfolio_dataset.py

In [None]:
# Upload the market news generator script
uploaded = files.upload()  # Upload market_news_generator.py

# Run the script to generate the market news dataset
!python market_news_generator.py
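
`files.upload()` prompts for a file every time the cell runs and is only importable inside a Colab runtime. A minimal sketch, assuming the generator scripts may already be in the working directory; `files_to_upload` is a hypothetical helper, not part of the application:

```python
from pathlib import Path


def files_to_upload(filenames, directory="."):
    """Return the filenames that are not yet present in `directory`."""
    base = Path(directory)
    return [name for name in filenames if not (base / name).is_file()]


needed = files_to_upload(["large_portfolio_dataset.py", "market_news_generator.py"])
if needed:
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import files
        files.upload()
    except ImportError:
        print("Not running in Colab; copy these files manually:", needed)
else:
    print("Generator scripts already present; skipping upload.")
```

Outside Colab the sketch only reports which files are missing, so the `!python` commands can still be run once the scripts are copied in by hand.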

In [None]:
# Load the generated datasets
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plot style
plt.style.use('dark_background')
sns.set_style("darkgrid")

# Load portfolio dataset
portfolio_df = pd.read_csv('large_portfolio_dataset.csv')

# Load market news dataset
news_df = pd.read_csv('large_market_news_dataset.csv')

# Display dataset info
print(f"Portfolio dataset shape: {portfolio_df.shape}")
print(f"Market news dataset shape: {news_df.shape}")

# Display first few rows of each dataset
print("\nPortfolio Dataset Preview:")
display(portfolio_df.head())

print("\nMarket News Dataset Preview:")
display(news_df.head())
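
The cells that follow select specific columns from the portfolio data, so it can help to confirm they exist right after loading. The sketch below uses a tiny stand-in frame with made-up values; `missing_columns` is a hypothetical helper, and in the notebook you would pass `portfolio_df` instead:

```python
import pandas as pd

# Columns that the later cells in this notebook select from the portfolio data.
REQUIRED_COLUMNS = [
    "name", "ticker", "asset_type", "sector",
    "current_price", "esg_score", "volatility", "roi_1y",
]


def missing_columns(df, required):
    """Return the required column names that are absent from df, in order."""
    return [col for col in required if col not in df.columns]


# Tiny stand-in frame with made-up values; in the notebook, pass portfolio_df.
sample_df = pd.DataFrame({
    "name": ["Example Solar"],
    "ticker": ["EXSL"],
    "esg_score": [82.0],
})
print(missing_columns(sample_df, REQUIRED_COLUMNS))
```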

## 1. Portfolio Recommendation Model

Let's load and test the portfolio recommendation model, which is implemented in the uploaded script:

In [None]:
# Upload the portfolio recommendation model script
uploaded = files.upload()  # Upload portfolio_recommendation_model.py

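# Import the model (generate_training_data is aliased because the other model modules export the same name)
from portfolio_recommendation_model import (
    PortfolioRecommendationModel,
    generate_training_data as generate_recommendation_training_data,
    get_portfolio_recommendations,
    visualize_recommendations,
)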

In [None]:
# Create a sample portfolio
# Select 10 random assets from the dataset (seeded for reproducibility)
np.random.seed(42)
portfolio_assets = portfolio_df.sample(10, random_state=42).copy()

# Display the portfolio
print("Sample Portfolio:")
display(portfolio_assets[['name', 'ticker', 'asset_type', 'sector', 'current_price', 'esg_score', 'volatility', 'roi_1y']])

# Set user preferences
user_preferences = {
    'risk_tolerance': 7,  # 1-10 (higher = more risk tolerant)
    'sustainability_focus': 8  # 1-10 (higher = more sustainability focused)
}

print(f"\nUser Preferences: Risk Tolerance = {user_preferences['risk_tolerance']}/10, Sustainability Focus = {user_preferences['sustainability_focus']}/10")
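
One way to picture how the two 1-10 preferences could influence a ranking is to rescale them to weights and blend normalized asset metrics. This is only an illustrative sketch: the scoring used by `get_portfolio_recommendations` lives in the uploaded script and may differ, and `preference_weights` and `blended_score` are hypothetical names.

```python
def preference_weights(risk_tolerance, sustainability_focus):
    """Map 1-10 preference scores onto weights in the range [0, 1]."""
    for value in (risk_tolerance, sustainability_focus):
        if not 1 <= value <= 10:
            raise ValueError("preferences must be between 1 and 10")
    return {
        "risk": (risk_tolerance - 1) / 9,
        "sustainability": (sustainability_focus - 1) / 9,
    }


def blended_score(esg, roi, volatility, weights):
    """Blend metrics that are already scaled to [0, 1] into one score.

    A higher sustainability weight favors ESG over return; a lower risk
    weight penalizes volatility more heavily.
    """
    s = weights["sustainability"]
    base = s * esg + (1 - s) * roi
    penalty = (1 - weights["risk"]) * volatility
    return base - penalty


weights = preference_weights(risk_tolerance=7, sustainability_focus=8)
print(weights)
print(round(blended_score(esg=0.9, roi=0.4, volatility=0.3, weights=weights), 3))
```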

In [None]:
# Get portfolio recommendations
recommendations = get_portfolio_recommendations(portfolio_assets, portfolio_df, user_preferences)

# Display top 10 recommendations
print("Top 10 Recommendations:")
display(recommendations.head(10)[['name', 'ticker', 'asset_type', 'sector', 'esg_score', 'roi_1y', 'volatility', 'final_score', 'recommendation_strength']])

# Visualize recommendations
visualize_recommendations(recommendations)

# Display the saved feature importance and recommendation plots
from IPython.display import Image
display(Image('models/feature_importance.png'))
display(Image('models/top_recommendations.png'))
display(Image('models/recommendation_strength_distribution.png'))
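
The summary describes the recommendation model as a gradient boosting regressor. The sketch below shows that technique with scikit-learn on synthetic data. The feature names, the target formula, and the hyperparameters are all assumptions for illustration and are not taken from `portfolio_recommendation_model.py`:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_assets = 500

# Synthetic features, each scaled to [0, 1] (assumed, not the real dataset).
feature_names = ["esg_score", "roi_1y", "volatility"]
X = rng.random((n_assets, len(feature_names)))

# Assumed target: reward ESG and return, penalize volatility, plus noise.
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(0, 0.02, n_assets)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Held-out R^2 and the per-feature importances behind a feature importance plot.
print(f"R^2 on held-out data: {model.score(X_test, y_test):.3f}")
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

The `feature_importances_` values sum to 1, which is what a feature importance bar chart would typically display.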

## 2. Risk Assessment Model

Now, let's load and test the risk assessment model:

In [None]:
# Upload the risk assessment model script
uploaded = files.upload()  # Upload risk_assessment_model.py

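# Import the model (generate_training_data is aliased because the other model modules export the same name)
from risk_assessment_model import (
    RiskAssessmentModel,
    generate_training_data as generate_risk_training_data,
    assess_portfolio_risk,
    visualize_risk_factors,
)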

In [None]:
# Assess portfolio risk
risk_assessment = assess_portfolio_risk(portfolio_assets)

# Display risk assessment results
print(f"Risk Category: {risk_assessment['risk_category']}")
print(f"Risk Score: {risk_assessment['risk_score']:.2f}/100")

print("\nRisk Probabilities:")
for category, prob in risk_assessment['risk_probabilities'].items():
    print(f"{category}: {prob:.2%}")

print("\nRisk Factors:")
for factor, score in risk_assessment['risk_factors'].items():
    print(f"{factor}: {score:.2f}/100")

# Display the risk assessment visualizations
display(Image('models/risk_feature_importance.png'))
display(Image('models/confusion_matrix.png'))
display(Image('models/risk_factors.png'))
display(Image('models/risk_gauge.png'))
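
The output above contains a risk category, per-category probabilities, and a 0-100 score. The sketch below shows how a random forest classifier can yield all three. It assumes synthetic features, three categories, and a score defined as the probability-weighted average of assumed category midpoints; none of these details come from `risk_assessment_model.py`:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic training data: columns are [volatility, esg_risk], both in [0, 1].
X = rng.random((600, 2))
combined = 0.7 * X[:, 0] + 0.3 * X[:, 1]
# Assumed labelling rule: 0 = Low, 1 = Medium, 2 = High.
y = np.digitize(combined, bins=[0.35, 0.65])

categories = ["Low", "Medium", "High"]
# Assumed score assigned to each category when computing the 0-100 score.
category_midpoints = np.array([15.0, 50.0, 85.0])

clf = RandomForestClassifier(n_estimators=100, random_state=1)
clf.fit(X, y)

# Hypothetical portfolio-level features: high volatility, moderate ESG risk.
portfolio_features = np.array([[0.8, 0.6]])
probabilities = clf.predict_proba(portfolio_features)[0]

risk_category = categories[int(np.argmax(probabilities))]
risk_score = float(probabilities @ category_midpoints)

print(f"Risk Category: {risk_category}")
print(f"Risk Score: {risk_score:.2f}/100")
for category, prob in zip(categories, probabilities):
    print(f"{category}: {prob:.2%}")
```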

## 3. Sentiment Analysis Model

Finally, let's load and test the sentiment analysis model:

In [None]:
# Upload the sentiment analysis model script
uploaded = files.upload()  # Upload sentiment_analysis_model.py

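# Import the model (generate_training_data is aliased because the other model modules export the same name)
from sentiment_analysis_model import (
    SentimentAnalysisModel,
    generate_training_data as generate_sentiment_training_data,
    analyze_market_sentiment,
    generate_market_news,
    visualize_sentiment,
)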

In [None]:
# Select a ticker from the portfolio
selected_ticker = portfolio_assets['ticker'].iloc[0]
print(f"Selected ticker for sentiment analysis: {selected_ticker}")

# Analyze market sentiment
sentiment_analysis = analyze_market_sentiment(selected_ticker, news_df)

# Display sentiment analysis results
print(f"Overall Sentiment: {sentiment_analysis['overall_sentiment']}")
print(f"Sentiment Score: {sentiment_analysis['sentiment_score']:.2f} (-100 to 100)")

print("\nSentiment Counts:")
for sentiment, count in sentiment_analysis['sentiment_counts'].items():
    print(f"{sentiment.capitalize()}: {count}")

print("\nRecent News:")
for i, news in enumerate(sentiment_analysis['news'][:5], start=1):
    print(f"{i}. {news['headline']} ({news['source']}, {news['publication_date']})")
    print(f"   Sentiment: {news['predicted_sentiment']}")

# Display the sentiment analysis visualizations
display(Image('models/sentiment_confusion_matrix.png'))
display(Image('models/wordcloud_positive.png'))
display(Image('models/wordcloud_negative.png'))
display(Image('models/wordcloud_neutral.png'))
display(Image('models/sentiment_distribution.png'))
display(Image('models/sentiment_gauge.png'))
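
The summary describes the sentiment model as natural language processing followed by a random forest classifier. The sketch below pairs TF-IDF features with a random forest and derives a score from the predicted labels. The headlines are made up, and the score formula, (positive - negative) / total × 100, is an assumption chosen to match the -100 to 100 range shown above rather than the formula in `sentiment_analysis_model.py`:

```python
from collections import Counter

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up training headlines with labels (illustrative only).
train_headlines = [
    "Solar firm posts record profit and raises outlook",
    "Wind energy shares surge after strong earnings",
    "Green bond issuer beats expectations with strong growth",
    "Regulator fines utility over pollution breach",
    "Battery maker shares plunge after profit warning",
    "Oil spill lawsuit triggers heavy losses for operator",
    "Company schedules annual shareholder meeting",
    "Fund publishes quarterly holdings report",
    "Board announces date for earnings call",
]
train_labels = ["positive"] * 3 + ["negative"] * 3 + ["neutral"] * 3

# TF-IDF turns each headline into a weighted bag-of-words vector.
pipeline = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(train_headlines, train_labels)


def sentiment_score(counts):
    """Assumed score: net share of positive headlines, scaled to [-100, 100]."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * (counts.get("positive", 0) - counts.get("negative", 0)) / total


new_headlines = [
    "Solar shares surge on record earnings",
    "Utility faces lawsuit after pollution breach",
]
# Convert NumPy string labels to plain str for clean printing.
predictions = [str(label) for label in pipeline.predict(new_headlines)]
counts = Counter(predictions)
print(dict(counts))
print(f"Sentiment Score: {sentiment_score(counts):.2f} (-100 to 100)")
```

With only nine training headlines the predictions are not meaningful; the point is the shape of the pipeline and how label counts map to a bounded score.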

## Summary

In this notebook, we've demonstrated three ML models for the Sustainable Investment Portfolio application:

1. **Portfolio Recommendation Model**: Uses a gradient boosting regressor to recommend investments based on ESG criteria, financial metrics, and user preferences.

2. **Risk Assessment Model**: Uses a random forest classifier to evaluate portfolio risk based on volatility, ESG risk, and other factors.

3. **Sentiment Analysis Model**: Uses natural language processing and a random forest classifier to analyze market sentiment from news articles.

These models supply the AI-powered recommendations, risk assessment, and sentiment analysis that the investment portfolio application presents to its users.