# Final Report: AI Project Analysis and Results

## Executive Summary
This report analyzes the results of two machine learning projects:
1. **Medical Image Classification** (X-ray analysis for pneumonia detection)
2. **Financial Sentiment Analysis** (Market sentiment classification)

Both projects apply deep learning to real-world problems. The best models reached 84.74% accuracy on the X-ray task and 78.31% on the sentiment task, and in both projects the simpler architecture outperformed the more complex one.

---

## Project 1: Medical Image Classification

### Overview
- **Objective:** Classify chest X-rays into Normal and Pneumonic categories.
- **Dataset:** 5,856 chest X-ray images.
- **Models Implemented:**
  - **Model 1:** Simple CNN architecture.
  - **Model 2:** Deep CNN with batch normalization.

### Results
#### Model 1 (Simple CNN):
- **Accuracy:** 84.74%
- **Precision (Normal):** 77%
- **Recall (Normal):** 59%
- **F1-Score (Pneumonic):** 90%

#### Model 2 (Deep CNN):
- **Accuracy:** 79.05%
- **Precision (Normal):** 58%
- **Recall (Normal):** 74%
- **F1-Score (Pneumonic):** 85%
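
The per-class figures above follow the standard definitions of precision, recall, and F1. As a minimal sketch (the helper names and the example counts are illustrative assumptions, not taken from the project code), these can be computed from true-positive, false-positive, and false-negative counts. The same formula also recovers the Normal-class F1 implied by the reported, rounded precision and recall:

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def per_class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical counts for one class, for illustration only.
precision, recall, f1 = per_class_metrics(tp=30, fp=10, fn=20)

# Normal-class F1 implied by the rounded figures reported above.
f1_normal_model1 = f1_from(0.77, 0.59)  # about 0.67
f1_normal_model2 = f1_from(0.58, 0.74)  # about 0.65
```

Because the inputs are rounded to whole percentages, the implied Normal-class F1 values (roughly 0.67 and 0.65) are approximations.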

### Key Insights
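- The simpler architecture (Model 1) achieved higher overall accuracy (84.74% vs. 79.05%), although Model 2 recalled more Normal cases (74% vs. 59%) at the cost of lower Normal precision (58% vs. 77%).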
- Class imbalance affected model performance, particularly for the Normal class.
- Data augmentation proved crucial for improving generalization and robustness.

---


## Project 2: Financial Sentiment Analysis

### Overview
- **Objective:** Classify financial texts into Bearish, Bullish, or Neutral sentiments.
- **Dataset:** 11,931 financial text samples.
- **Models Implemented:**
  - **Model 1:** CNN with word embeddings.
  - **Model 2:** Bidirectional LSTM.

### Results
#### Model 1 (CNN):
- **Overall Accuracy:** 78.31%
- **Best Performance:** Neutral class (F1-score: 86%).
- **Strength:** Strong Bullish sentiment detection.

#### Model 2 (Bi-LSTM):
- **Overall Accuracy:** 76.05%
- **Consistent Performance:** Neutral class (F1-score: 85%).
- **Strength:** Better at detecting Bearish sentiments.

### Key Insights
- Class imbalance significantly impacted performance on Bearish and Bullish classes.
- Both models performed best on the Neutral class (F1-scores of 86% for the CNN and 85% for the Bi-LSTM).
- Text preprocessing and tokenization were critical for achieving high performance.

---

## Technical Implementation Details

### Data Processing
#### Image Classification:
- Standardized image resolution to **224x224**.
- Applied data augmentation (rotation, flipping, scaling).
- Split data into **70% Train, 10% Validation, 20% Test**.
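
A 70/10/20 split like the one above can be sketched by shuffling sample indices once and slicing. The function name, the fixed seed, and the choice to give any rounding remainder to the test set are assumptions for illustration; the report does not say how the split was produced or whether it was stratified by class.

```python
import random


def split_indices(n_samples, train_frac=0.7, val_frac=0.1, seed=42):
    """Shuffle indices once, then slice into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility

    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 5,856 is the dataset size stated in the overview.
train_idx, val_idx, test_idx = split_indices(5856)
```

With 5,856 images this yields 4,099 training, 585 validation, and 1,172 test indices, with no index shared between splits.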

#### Sentiment Analysis:
- Cleaned and tokenized text data.
- Used word embeddings with a dimension of **100**.
- Split data into **80% Train, 10% Validation, 10% Test**.
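
The cleaning and tokenization step can be sketched as follows: lowercase the text, extract tokens, map them to integer ids, and pad to a fixed length so that an embedding layer can look up one 100-dimensional vector per id. The regular expression, the reserved ids, the vocabulary cap, and `max_len` are all assumptions for illustration, not the project's actual settings.

```python
import re
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # reserved ids for padding and unknown tokens


def tokenize(text):
    """Lowercase and keep runs of letters, digits, and a few finance symbols."""
    return re.findall(r"[a-z0-9$%']+", text.lower())


def build_vocab(texts, max_size=10000):
    """Assign ids (starting at 2) to the most frequent tokens."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_size))}


def encode(text, vocab, max_len=8):
    """Convert text to a fixed-length list of token ids."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical sample texts, not drawn from the project dataset.
vocab = build_vocab(["Stocks rally on earnings", "Stocks fall"])
encoded = encode("Stocks soar", vocab, max_len=4)
```

Here `"stocks"` is the most frequent token and receives id 2, while the unseen word `"soar"` maps to the unknown id, so `encoded` is `[2, 1, 0, 0]`.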

---

### Model Architectures
#### Medical Images:
- CNN with dropout and batch normalization.
- Multiple convolutional layers with max pooling.

#### Sentiment Analysis:
- Word embedding layer (dimension 100) for encoding token sequences.
- CNN and Bi-LSTM implementations.
- Dropout layers for regularization.

---


## Challenges and Solutions

### Challenges Encountered
- Class imbalance in both projects.
- Limited dataset size for certain classes.
- Overfitting in complex models.
- Resource constraints during training.

### Solutions Implemented
- Applied data augmentation techniques.
- Used class weighting in loss functions.
- Added dropout and regularization layers.
- Implemented early stopping to prevent overfitting.
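
Two of these solutions are easy to illustrate without a deep learning framework. The sketch below computes inverse-frequency class weights (the same heuristic scikit-learn uses for `class_weight="balanced"`) and a patience-based early-stopping check. The function names, the label counts, the loss values, and the patience setting are hypothetical; the report does not state the actual weights or stopping criteria used.

```python
from collections import Counter


def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c]); rarer classes weigh more."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


def should_stop(val_losses, patience=3, min_delta=0.0):
    """True when none of the last `patience` epochs beat the earlier best loss."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent = val_losses[-patience:]
    return all(loss > best_before - min_delta for loss in recent)


# Hypothetical imbalanced label set: 25 Normal vs. 75 Pneumonic.
weights = balanced_class_weights(["Normal"] * 25 + ["Pneumonic"] * 75)

# Hypothetical validation losses: no improvement after the third epoch.
stop_now = should_stop([1.0, 0.8, 0.7, 0.71, 0.72, 0.73], patience=3)
```

In this example the minority class receives a weight of 2.0 and the majority class about 0.67, and `stop_now` is `True` because the last three epochs never improve on the earlier best loss of 0.7.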

---

## Recommendations

### Medical Image Classification:
- Collect a more balanced dataset for better class performance.
- Implement ensemble methods for improved accuracy.
- Explore transfer learning with pretrained models.
- Add explainability features (e.g., attention maps).

### Sentiment Analysis:
- Address class imbalance through additional data collection or synthetic sampling.
- Evaluate transformer-based NLP models such as BERT.
- Add domain-specific preprocessing for financial text.
- Develop confidence thresholds for predictions to handle uncertainty.
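
The confidence-threshold recommendation could take the following shape: convert the model's raw scores to probabilities with softmax and abstain when the top probability falls below a cutoff. The label order, the 0.6 threshold, and the function names are assumptions for illustration; a real threshold would need tuning on validation data.

```python
import math

LABELS = ("Bearish", "Bullish", "Neutral")  # assumed output order


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_threshold(logits, threshold=0.6):
    """Return (label, confidence), or (None, confidence) when below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]  # abstain: route to review or a fallback
    return LABELS[best], probs[best]


# Hypothetical model outputs.
confident = predict_with_threshold([0.0, 5.0, 0.0])
uncertain = predict_with_threshold([0.1, 0.2, 0.15])
```

The first call returns the label `"Bullish"` with high confidence, while the second abstains because the three probabilities are nearly equal.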

---

## Future Improvements

### Technical Enhancements
- Implement cross-validation to evaluate model robustness.
- Explore hybrid architectures combining CNNs and RNNs.
- Add real-time prediction capabilities for deployment.
- Improve model interpretability using explainable AI techniques.
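
For the cross-validation item, stratified folds are a natural fit given the class imbalance reported in both projects, since each fold then keeps roughly the same class proportions. The sketch below is a minimal pure-Python version; the function name, fold count, and labels are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs with per-class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val


# Hypothetical imbalanced labels: 10 of one class, 5 of another.
labels = ["Neutral"] * 10 + ["Bearish"] * 5
splits = list(stratified_kfold(labels, k=5))
```

Each of the five validation folds here contains two `Neutral` samples and one `Bearish` sample, and every sample appears in exactly one validation fold.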

---

## Conclusion
Both projects demonstrated the potential and limitations of deep learning in specialized domains. The following insights were key:

1. In both projects, the simpler model achieved higher overall accuracy, underscoring the importance of architecture selection.
2. Data preparation (augmentation, tokenization) had a significant impact on performance.
3. Class imbalance remains a critical challenge, especially for underrepresented categories.

### Comparative Analysis:
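- The **medical image classification project** achieved higher accuracy (84.74% vs. 78.31% for the best model in each project), likely due in part to the structured nature of image data. The tasks differ in class count (two vs. three), so the figures are not directly comparable.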
- The **sentiment analysis project** showed promise but requires further work on class imbalance and feature engineering.