<a href="https://colab.research.google.com/github/SinnottKayleigh/B2B-Sales-Algos/blob/main/MIFID_Classifications.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
pip install dash


This snippet can be incorporated into a larger algorithm to identify the MiFID classification of clients trading options and forwards.

Elective Professional:
- Must meet 2 of the following 3 criteria:
- The client has carried out transactions, in significant size, on the relevant market at an ***average frequency of 10 per quarter over the previous 4 quarters.***
- The size of the client's financial instrument portfolio, defined as ***including cash deposits and financial instruments, exceeds EUR 500,000.***
- The client works or has worked in the financial sector for ***at least 1 year in a professional position*** which requires knowledge of the transactions or services envisaged.

Per Se Professional:
- Must meet ***2 of the following 3*** size requirements:
- Balance sheet total of EUR ***20,000,000***
- Net turnover of EUR ***40,000,000***
- Own funds of EUR ***2,000,000***



Or an entity required to be authorised or regulated to operate in the financial markets:
- A credit institution
- An investment firm
- Any other authorised or regulated financial institution
- A collective investment scheme or the management company of such a scheme
- A pension fund or the management company of a pension fund
- A commodity or commodity derivatives dealer
- A local authority
- Any other institutional investor

FUNCTION ClassifyMiFIDClient(clientData):
    // Initialize classification flags
    IS_ELECTIVE_PROFESSIONAL = false
    IS_PER_SE_PROFESSIONAL = false
    IS_INSTITUTIONAL = false

    // First check if client is an institutional entity
    IF CheckInstitutionalStatus(clientData):
        IS_INSTITUTIONAL = true
        RETURN {
            "classification": "Institutional Professional",
            "reason": "Qualified as institutional entity",
            "category": clientData.entityType
        }

    // Check Per Se Professional criteria
    perSeCriteriaMet = 0
    IF clientData.balanceSheet >= 20000000:
        perSeCriteriaMet += 1
    IF clientData.netTurnover >= 40000000:
        perSeCriteriaMet += 1
    IF clientData.ownFunds >= 2000000:
        perSeCriteriaMet += 1

    IF perSeCriteriaMet >= 2:
        IS_PER_SE_PROFESSIONAL = true
        RETURN {
            "classification": "Per Se Professional",
            "criteriaMet": perSeCriteriaMet,
            "metrics": {
                "balanceSheet": clientData.balanceSheet,
                "netTurnover": clientData.netTurnover,
                "ownFunds": clientData.ownFunds
            }
        }

    // Check Elective Professional criteria
    electiveCriteriaMet = 0

    // Check transaction frequency
    IF CheckTransactionFrequency(clientData.transactions):
        electiveCriteriaMet += 1

    // Check portfolio size
    IF CheckPortfolioSize(clientData.portfolio):
        electiveCriteriaMet += 1

    // Check professional experience
    IF CheckProfessionalExperience(clientData.experience):
        electiveCriteriaMet += 1

    IF electiveCriteriaMet >= 2:
        IS_ELECTIVE_PROFESSIONAL = true
        RETURN {
            "classification": "Elective Professional",
            "criteriaMet": electiveCriteriaMet,
            "metrics": {
                "transactionFrequency": GetTransactionMetrics(clientData.transactions),
                "portfolioSize": clientData.portfolio.totalValue,
                "professionalExperience": clientData.experience.duration
            }
        }

    // If no professional criteria met
    RETURN {
        "classification": "Retail",
        "reason": "Did not meet professional criteria",
        "electiveCriteriaMet": electiveCriteriaMet,
        "perSeCriteriaMet": perSeCriteriaMet
    }

FUNCTION CheckTransactionFrequency(transactions):
    quarterlyTransactions = []
    FOR EACH quarter IN last4Quarters:
        significantTransactions = FilterSignificantTransactions(transactions[quarter])
        quarterlyTransactions.append(COUNT(significantTransactions))
    
    averageTransactions = AVERAGE(quarterlyTransactions)
    RETURN averageTransactions >= 10

FUNCTION CheckPortfolioSize(portfolio):
    totalValue = portfolio.cashDeposits + portfolio.financialInstruments
    // The criterion requires the portfolio to exceed EUR 500,000
    RETURN totalValue > 500000

FUNCTION CheckProfessionalExperience(experience):
    IF experience.sector == "financial" AND
       experience.duration >= 1 AND
       experience.position == "professional":
        RETURN true
    RETURN false

FUNCTION CheckInstitutionalStatus(clientData):
    institutionalTypes = [
        "credit_institution",
        "investment_firm",
        "regulated_financial_institution",
        "collective_investment_scheme",
        "pension_fund",
        "commodity_dealer",
        "local_authority",
        "institutional_investor"
    ]
    
    RETURN clientData.entityType IN institutionalTypes

FUNCTION FilterSignificantTransactions(transactions):
    // Define significance threshold based on market standards
    significantThreshold = DefineSignificanceThreshold(transactions.market)
    RETURN FILTER transactions WHERE value >= significantThreshold
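
The pseudocode above leaves `last4Quarters` and `DefineSignificanceThreshold` undefined. Below is a minimal sketch of one way to fill them in, using only the standard library. The helper names (`quarter_of`, `last_four_quarters`, `average_significant_per_quarter`), the flat EUR 100,000 threshold, and the choice of reference date are assumptions for illustration, not part of the criteria themselves. The point it illustrates is that a quarter with no significant trades should count as zero in the average rather than being skipped.

```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Tuple

Quarter = Tuple[int, int]  # (year, quarter number 1-4)


def quarter_of(d: date) -> Quarter:
    """Map a date to its calendar quarter."""
    return (d.year, (d.month - 1) // 3 + 1)


def last_four_quarters(reference: date) -> List[Quarter]:
    """Return the four calendar quarters ending with the reference date's quarter."""
    year, q = quarter_of(reference)
    quarters = []
    for _ in range(4):
        quarters.append((year, q))
        q -= 1
        if q == 0:
            year, q = year - 1, 4
    return list(reversed(quarters))


def average_significant_per_quarter(
    trades: Iterable[Tuple[date, float]],
    reference: date,
    threshold: float = 100_000,  # assumed significance threshold, in EUR
) -> float:
    """Average count of significant trades over the last four quarters."""
    window = last_four_quarters(reference)
    counts = Counter(
        quarter_of(d) for d, value in trades
        if value >= threshold and quarter_of(d) in window
    )
    # Counter returns 0 for quarters with no trades, so empty quarters
    # pull the average down instead of being skipped.
    return sum(counts[q] for q in window) / len(window)


# 12 significant trades in each of two quarters, none in the other two
trades = [(date(2024, 8, 1), 150_000.0)] * 12 + [(date(2024, 11, 1), 150_000.0)] * 12
avg = average_significant_per_quarter(trades, reference=date(2024, 12, 31))
print(avg, avg >= 10)  # 6.0 False
```

Averaging only over the quarters that contain trades would give 12 here and pass the test, which is why the window is fixed first and the counts are looked up afterwards.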

In [None]:
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Dict, Optional
import pandas as pd
from enum import Enum
import numpy as np

class ClientClassification(Enum):
    RETAIL = "Retail"
    ELECTIVE_PROFESSIONAL = "Elective Professional"
    PER_SE_PROFESSIONAL = "Per Se Professional"
    INSTITUTIONAL = "Institutional Professional"

class EntityType(Enum):
    CREDIT_INSTITUTION = "credit_institution"
    INVESTMENT_FIRM = "investment_firm"
    FINANCIAL_INSTITUTION = "regulated_financial_institution"
    INVESTMENT_SCHEME = "collective_investment_scheme"
    PENSION_FUND = "pension_fund"
    COMMODITY_DEALER = "commodity_dealer"
    LOCAL_AUTHORITY = "local_authority"
    INSTITUTIONAL_INVESTOR = "institutional_investor"
    OTHER = "other"

@dataclass
class Transaction:
    date: datetime
    value: float
    market: str
    type: str

@dataclass
class Portfolio:
    cash_deposits: float
    financial_instruments: float

@dataclass
class ProfessionalExperience:
    sector: str
    duration: float  # in years
    position: str

@dataclass
class ClientData:
    client_id: str
    entity_type: EntityType
    transactions: List[Transaction]
    portfolio: Portfolio
    experience: Optional[ProfessionalExperience]
    balance_sheet: Optional[float]
    net_turnover: Optional[float]
    own_funds: Optional[float]

class MiFIDClassifier:
    def __init__(self):
        self.SIGNIFICANT_TRANSACTION_THRESHOLDS = {
            "fx_spot": 100000,
            "fx_forward": 250000,
            "fx_option": 500000,
            "default": 100000
        }

    def classify_client(self, client_data: ClientData) -> Dict:
        """Main classification method"""

        if self._check_institutional_status(client_data.entity_type):
            return {
                "classification": ClientClassification.INSTITUTIONAL.value,
                "reason": "Qualified as institutional entity",
                "category": client_data.entity_type.value
            }

        per_se_result = self._check_per_se_professional(client_data)
        if per_se_result["qualified"]:
            return {
                "classification": ClientClassification.PER_SE_PROFESSIONAL.value,
                "criteria_met": per_se_result["criteria_met"],
                "metrics": per_se_result["metrics"]
            }

        elective_result = self._check_elective_professional(client_data)
        if elective_result["qualified"]:
            return {
                "classification": ClientClassification.ELECTIVE_PROFESSIONAL.value,
                "criteria_met": elective_result["criteria_met"],
                "metrics": elective_result["metrics"]
            }

        return {
            "classification": ClientClassification.RETAIL.value,
            "reason": "Did not meet professional criteria",
            "per_se_criteria_met": per_se_result["criteria_met"],
            "elective_criteria_met": elective_result["criteria_met"]
        }

    def _check_institutional_status(self, entity_type: EntityType) -> bool:
        """Check if client is an institutional entity"""
        return entity_type != EntityType.OTHER

    def _check_per_se_professional(self, client_data: ClientData) -> Dict:
        """Check Per Se Professional criteria"""
        criteria_met = 0
        metrics = {
            "balance_sheet": client_data.balance_sheet,
            "net_turnover": client_data.net_turnover,
            "own_funds": client_data.own_funds
        }

        if client_data.balance_sheet and client_data.balance_sheet >= 20_000_000:
            criteria_met += 1
        if client_data.net_turnover and client_data.net_turnover >= 40_000_000:
            criteria_met += 1
        if client_data.own_funds and client_data.own_funds >= 2_000_000:
            criteria_met += 1

        return {
            "qualified": criteria_met >= 2,
            "criteria_met": criteria_met,
            "metrics": metrics
        }

    def _check_elective_professional(self, client_data: ClientData) -> Dict:
        """Check Elective Professional criteria"""
        criteria_met = 0
        metrics = {}

        transaction_result = self._check_transaction_frequency(client_data.transactions)
        if transaction_result["qualified"]:
            criteria_met += 1
        metrics["transactions"] = transaction_result["metrics"]

        portfolio_result = self._check_portfolio_size(client_data.portfolio)
        if portfolio_result["qualified"]:
            criteria_met += 1
        metrics["portfolio"] = portfolio_result["metrics"]

        if client_data.experience:
            experience_result = self._check_professional_experience(client_data.experience)
            if experience_result["qualified"]:
                criteria_met += 1
            metrics["experience"] = experience_result["metrics"]

        return {
            "qualified": criteria_met >= 2,
            "criteria_met": criteria_met,
            "metrics": metrics
        }

    def _check_transaction_frequency(self, transactions: List[Transaction]) -> Dict:
        """Check if transaction frequency meets criteria"""
        if not transactions:
            return {"qualified": False, "metrics": {"avg_quarterly_transactions": 0}}

        df = pd.DataFrame([
            {
                "date": t.date,
                "value": t.value,
                "market": t.market
            } for t in transactions
        ])

        df["is_significant"] = df.apply(
            lambda x: x["value"] >= self.SIGNIFICANT_TRANSACTION_THRESHOLDS.get(
                x["market"], self.SIGNIFICANT_TRANSACTION_THRESHOLDS["default"]
            ),
            axis=1
        )

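        df["quarter"] = df["date"].dt.to_period("Q")
        quarterly_counts = df[df["is_significant"]].groupby("quarter").size()

        # Reindex over the four most recent calendar quarters so that quarters
        # with no significant trades count as zero instead of being skipped.
        window = pd.period_range(end=df["quarter"].max(), periods=4)
        last_4_quarters = quarterly_counts.reindex(window, fill_value=0)
        avg_quarterly_transactions = last_4_quarters.mean()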

        return {
            "qualified": avg_quarterly_transactions >= 10,
            "metrics": {
                "avg_quarterly_transactions": avg_quarterly_transactions,
                "quarterly_breakdown": quarterly_counts.to_dict()
            }
        }

    def _check_portfolio_size(self, portfolio: Portfolio) -> Dict:
        """Check if portfolio size meets criteria"""
        total_value = portfolio.cash_deposits + portfolio.financial_instruments
        return {
            # The criterion requires the portfolio to exceed EUR 500,000
            "qualified": total_value > 500_000,
            "metrics": {
                "total_value": total_value,
                "cash_deposits": portfolio.cash_deposits,
                "financial_instruments": portfolio.financial_instruments
            }
        }

    def _check_professional_experience(self, experience: ProfessionalExperience) -> Dict:
        """Check if professional experience meets criteria"""
        qualified = (
            experience.sector.lower() == "financial" and
            experience.duration >= 1 and
            experience.position.lower() == "professional"
        )
        return {
            "qualified": qualified,
            "metrics": {
                "sector": experience.sector,
                "duration": experience.duration,
                "position": experience.position
            }
        }

def example_usage():
    client_data = ClientData(
        client_id="12345",
        entity_type=EntityType.OTHER,
        transactions=[
            Transaction(
                date=datetime.now() - timedelta(days=x),
                value=150000,
                market="fx_spot",
                type="spot"
            ) for x in range(0, 365, 7)
        ],
        portfolio=Portfolio(
            cash_deposits=300000,
            financial_instruments=300000
        ),
        experience=ProfessionalExperience(
            sector="financial",
            duration=1.5,
            position="professional"
        ),
        balance_sheet=25000000,
        net_turnover=45000000,
        own_funds=3000000
    )

    classifier = MiFIDClassifier()
    result = classifier.classify_client(client_data)

    print("Classification Result:")
    print(result)

if __name__ == "__main__":
    example_usage()

Classification Result:
{'classification': 'Per Se Professional', 'criteria_met': 3, 'metrics': {'balance_sheet': 25000000, 'net_turnover': 45000000, 'own_funds': 3000000}}


To visualise the data, the following cell uses Plotly (matplotlib is imported only to set a default style).

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import calendar

class MiFIDVisualizer:
    def __init__(self, classifier_results: Dict, client_data: ClientData):
        self.results = classifier_results
        self.client_data = client_data
        self.set_style()

    def set_style(self):
        """Set the style for matplotlib visualizations"""
        plt.style.use('default')  # Use default style instead of seaborn

    def create_dashboard(self):
        """Create a comprehensive dashboard of visualizations"""
        print(f"\nMiFID Classification Dashboard for Client {self.client_data.client_id}")
        print("=" * 50)

        self.plot_transaction_analysis()
        self.plot_portfolio_breakdown()
        self.plot_criteria_summary()
        self.plot_transaction_heatmap()
        if 'metrics' in self.results:
            self.plot_professional_criteria_radar()

    def plot_transaction_analysis(self):
        """Visualize transaction patterns"""
        df = pd.DataFrame([
            {
                'date': t.date,
                'value': t.value,
                'market': t.market,
                'type': t.type
            } for t in self.client_data.transactions
        ])

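        fig = make_subplots(
            rows=2, cols=2,
            # Pie traces need a 'domain' subplot; the other panels are cartesian
            specs=[
                [{'type': 'xy'}, {'type': 'domain'}],
                [{'type': 'xy'}, {'type': 'xy'}]
            ],
            subplot_titles=(
                'Transaction Volume Over Time',
                'Transaction Distribution by Market',
                'Monthly Transaction Count',
                'Transaction Size Distribution'
            )
        )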

        fig.add_trace(
            go.Scatter(
                x=df['date'],
                y=df['value'],
                mode='lines+markers',
                name='Transaction Value'
            ),
            row=1, col=1
        )

        market_dist = df.groupby('market')['value'].sum()
        fig.add_trace(
            go.Pie(
                labels=market_dist.index,
                values=market_dist.values,
                name='Market Distribution'
            ),
            row=1, col=2
        )

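        # Group by year-month so bars stay in chronological order and the same
        # month in different years is not merged.
        monthly_count = df.groupby(df['date'].dt.to_period('M'))['value'].count()
        fig.add_trace(
            go.Bar(
                x=monthly_count.index.astype(str),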
                y=monthly_count.values,
                name='Monthly Count'
            ),
            row=2, col=1
        )

        fig.add_trace(
            go.Histogram(
                x=df['value'],
                name='Size Distribution'
            ),
            row=2, col=2
        )

        fig.update_layout(
            height=800,
            showlegend=False,
            title_text="Transaction Analysis Dashboard",
            title_x=0.5
        )
        fig.show()

    def plot_portfolio_breakdown(self):
        """Visualize portfolio composition"""
        portfolio = self.client_data.portfolio

        fig = go.Figure()

        fig.add_trace(go.Pie(
            labels=['Cash Deposits', 'Financial Instruments'],
            values=[portfolio.cash_deposits, portfolio.financial_instruments],
            hole=0.4
        ))

        fig.update_layout(
            title={
                'text': 'Portfolio Composition',
                'x': 0.5
            },
            annotations=[{
                'text': f'Total: €{portfolio.cash_deposits + portfolio.financial_instruments:,.0f}',
                'showarrow': False,
                'font': {'size': 20}
            }]
        )

        fig.show()

    def plot_criteria_summary(self):
        """Visualize criteria fulfillment"""
        if 'metrics' in self.results:
            metrics = self.results['metrics']

            criteria_data = {
                'Criteria': [
                    'Transaction Frequency',
                    'Portfolio Size',
                    'Professional Experience'
                ],
                'Required': [10, 500000, 1],
                'Actual': [
                    metrics.get('transactions', {}).get('avg_quarterly_transactions', 0),
                    metrics.get('portfolio', {}).get('total_value', 0),
                    metrics.get('experience', {}).get('duration', 0)
                ]
            }

            df = pd.DataFrame(criteria_data)

            fig = go.Figure()

            fig.add_trace(go.Bar(
                name='Required',
                x=df['Criteria'],
                y=df['Required'],
                marker_color='lightgray'
            ))

            fig.add_trace(go.Bar(
                name='Actual',
                x=df['Criteria'],
                y=df['Actual'],
                marker_color='rgb(66, 135, 245)'
            ))

            fig.update_layout(
                title={
                    'text': 'Criteria Fulfillment Summary',
                    'x': 0.5
                },
                barmode='group'
            )

            fig.show()

    def plot_transaction_heatmap(self):
        """Create a heatmap of transaction activity"""
        df = pd.DataFrame([
            {
                'date': t.date,
                'value': t.value
            } for t in self.client_data.transactions
        ])

        df['month'] = df['date'].dt.month
        df['day'] = df['date'].dt.day

        heatmap_data = df.pivot_table(
            values='value',
            index='day',
            columns='month',
            aggfunc='sum'
        ).fillna(0)

        fig = go.Figure(data=go.Heatmap(
            z=heatmap_data.values,
            x=[calendar.month_abbr[m] for m in heatmap_data.columns],
            y=heatmap_data.index,
            colorscale='Viridis'
        ))

        fig.update_layout(
            title={
                'text': 'Transaction Activity Heatmap',
                'x': 0.5
            },
            xaxis_title='Month',
            yaxis_title='Day of Month'
        )

        fig.show()

    def plot_professional_criteria_radar(self):
        """Create a radar chart for professional criteria"""
        if self.client_data.balance_sheet and self.client_data.net_turnover and self.client_data.own_funds:
            criteria_values = [
                (self.client_data.balance_sheet / 20000000) * 100,
                (self.client_data.net_turnover / 40000000) * 100,
                (self.client_data.own_funds / 2000000) * 100
            ]

            fig = go.Figure()

            fig.add_trace(go.Scatterpolar(
                r=criteria_values,
                theta=['Balance Sheet', 'Net Turnover', 'Own Funds'],
                fill='toself',
                name='Actual vs Required (%)'
            ))

            fig.update_layout(
                polar=dict(
                    radialaxis=dict(
                        visible=True,
                        range=[0, max(criteria_values) + 10]
                    )),
                showlegend=False,
                title={
                    'text': 'Professional Criteria Radar Chart',
                    'x': 0.5
                }
            )

            fig.show()
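
`plot_criteria_summary` above plots a trade count, a EUR amount, and a number of years on one axis, so the two smaller bars are hard to see next to the portfolio value. One possible remedy, sketched here, is to express each value as a percentage of its threshold, as `plot_professional_criteria_radar` already does for the per se criteria. The function name `percent_of_required` and the sample values are hypothetical and only serve as an illustration.

```python
from typing import Dict, Tuple


def percent_of_required(criteria: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Convert (actual, required) pairs into percent-of-threshold values."""
    return {
        name: round(actual / required * 100, 1)
        for name, (actual, required) in criteria.items()
    }


# Illustrative (actual, required) pairs; the thresholds come from the criteria list
elective = {
    "Transaction Frequency": (13.0, 10),
    "Portfolio Size": (600_000, 500_000),
    "Professional Experience": (1.5, 1),
}
scaled = percent_of_required(elective)
print(scaled)
```

With every criterion on the same 0 to 100+ scale, a single reference line at 100 marks the threshold for all three bars.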

In [None]:
import plotly.graph_objects as go
from datetime import datetime, timedelta

dates = pd.date_range(start='2023-01-01', end='2023-12-31', freq='D')
values = np.random.normal(500000, 100000, len(dates))
transactions = pd.DataFrame({
    'date': dates,
    'value': values
})

fig = go.Figure()

fig.add_trace(go.Bar(
    name='Portfolio Value',
    x=['Current Portfolio'],
    y=[750000],
    marker_color='#2C3E50'
))

# A 'lines' trace with a single point draws nothing, so mark the threshold
# with a horizontal rule across the plot instead.
fig.add_hline(
    y=500000,
    line=dict(color='#E74C3C', width=2, dash='dash'),
    annotation_text='MiFID Requirement'
)

fig.update_layout(
    title={
        'text': 'Portfolio Value vs MiFID Requirement',
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    yaxis_title='EUR',
    plot_bgcolor='white',
    paper_bgcolor='white',
    font=dict(color='#2C3E50'),
    showlegend=True,
    height=500
)

fig.show()

# Group by calendar quarter via periods; this avoids the deprecated 'Q' offset alias
quarterly_counts = transactions.groupby(transactions['date'].dt.to_period('Q')).size()

fig2 = go.Figure()

fig2.add_trace(go.Bar(
    x=quarterly_counts.index.astype(str),
    y=quarterly_counts.values,
    name='Transactions',
    marker_color='#2C3E50'
))

fig2.add_trace(go.Scatter(
    x=quarterly_counts.index.astype(str),
    y=[10] * len(quarterly_counts),
    mode='lines',
    name='MiFID Requirement',
    line=dict(color='#E74C3C', width=2, dash='dash')
))

fig2.update_layout(
    title={
        'text': 'Quarterly Transaction Frequency',
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    xaxis_title='Quarter',
    yaxis_title='Number of Transactions',
    plot_bgcolor='white',
    paper_bgcolor='white',
    font=dict(color='#2C3E50'),
    showlegend=True,
    height=500
)

fig2.show()

criteria_data = {
    'Criteria': ['Balance Sheet', 'Net Turnover', 'Own Funds'],
    'Current': [25000000, 45000000, 3000000],
    'Required': [20000000, 40000000, 2000000]
}

fig3 = go.Figure()

fig3.add_trace(go.Bar(
    name='Current Value',
    x=criteria_data['Criteria'],
    y=criteria_data['Current'],
    marker_color='#2C3E50'
))

fig3.add_trace(go.Bar(
    name='Required Value',
    x=criteria_data['Criteria'],
    y=criteria_data['Required'],
    marker_color='#E74C3C'
))

fig3.update_layout(
    title={
        'text': 'Professional Criteria Status',
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    yaxis_title='EUR',
    plot_bgcolor='white',
    paper_bgcolor='white',
    font=dict(color='#2C3E50'),
    showlegend=True,
    height=500,
    barmode='group'
)

fig3.show()


Classification/Suitability of clients

In [None]:
class ClientClassification(Enum):
    RETAIL = "Retail"
    ELECTIVE_PROFESSIONAL = "Elective Professional"
    PER_SE_PROFESSIONAL = "Per Se Professional"
    INSTITUTIONAL = "Institutional Professional"

class EntityType(Enum):
    CREDIT_INSTITUTION = "credit_institution"
    INVESTMENT_FIRM = "investment_firm"
    FINANCIAL_INSTITUTION = "regulated_financial_institution"
    INVESTMENT_SCHEME = "collective_investment_scheme"
    PENSION_FUND = "pension_fund"
    COMMODITY_DEALER = "commodity_dealer"
    LOCAL_AUTHORITY = "local_authority"
    INSTITUTIONAL_INVESTOR = "institutional_investor"
    OTHER = "other"

@dataclass
class Transaction:
    date: datetime
    value: float
    market: str
    type: str

@dataclass
class Portfolio:
    cash_deposits: float
    financial_instruments: float

@dataclass
class ProfessionalExperience:
    sector: str
    duration: float  # in years
    position: str

@dataclass
class ClientData:
    client_id: str
    entity_type: EntityType
    transactions: List[Transaction]
    portfolio: Portfolio
    experience: Optional[ProfessionalExperience]
    balance_sheet: Optional[float]
    net_turnover: Optional[float]
    own_funds: Optional[float]

class MiFIDClassifier:
    def __init__(self):
        self.SIGNIFICANT_TRANSACTION_THRESHOLDS = {
            "fx_spot": 100000,
            "fx_forward": 250000,
            "fx_option": 500000,
            "default": 100000
        }

    def classify_client(self, client_data: ClientData) -> Dict:
        """Main classification method"""

        if self._check_institutional_status(client_data.entity_type):
            return {
                "classification": ClientClassification.INSTITUTIONAL.value,
                "reason": "Qualified as institutional entity",
                "category": client_data.entity_type.value
            }

        per_se_result = self._check_per_se_professional(client_data)
        if per_se_result["qualified"]:
            return {
                "classification": ClientClassification.PER_SE_PROFESSIONAL.value,
                "criteria_met": per_se_result["criteria_met"],
                "metrics": per_se_result["metrics"]
            }

        elective_result = self._check_elective_professional(client_data)
        if elective_result["qualified"]:
            return {
                "classification": ClientClassification.ELECTIVE_PROFESSIONAL.value,
                "criteria_met": elective_result["criteria_met"],
                "metrics": elective_result["metrics"]
            }

        return {
            "classification": ClientClassification.RETAIL.value,
            "reason": "Did not meet professional criteria",
            "per_se_criteria_met": per_se_result["criteria_met"],
            "elective_criteria_met": elective_result["criteria_met"]
        }

    def _check_institutional_status(self, entity_type: EntityType) -> bool:
        """Check if client is an institutional entity"""
        return entity_type != EntityType.OTHER

    def _check_per_se_professional(self, client_data: ClientData) -> Dict:
        """Check Per Se Professional criteria"""
        criteria_met = 0
        metrics = {
            "balance_sheet": client_data.balance_sheet,
            "net_turnover": client_data.net_turnover,
            "own_funds": client_data.own_funds
        }

        if client_data.balance_sheet and client_data.balance_sheet >= 20_000_000:
            criteria_met += 1
        if client_data.net_turnover and client_data.net_turnover >= 40_000_000:
            criteria_met += 1
        if client_data.own_funds and client_data.own_funds >= 2_000_000:
            criteria_met += 1

        return {
            "qualified": criteria_met >= 2,
            "criteria_met": criteria_met,
            "metrics": metrics
        }

    def _check_elective_professional(self, client_data: ClientData) -> Dict:
        """Check Elective Professional criteria"""
        criteria_met = 0
        metrics = {}

        transaction_result = self._check_transaction_frequency(client_data.transactions)
        if transaction_result["qualified"]:
            criteria_met += 1
        metrics["transactions"] = transaction_result["metrics"]

        portfolio_result = self._check_portfolio_size(client_data.portfolio)
        if portfolio_result["qualified"]:
            criteria_met += 1
        metrics["portfolio"] = portfolio_result["metrics"]

        if client_data.experience:
            experience_result = self._check_professional_experience(client_data.experience)
            if experience_result["qualified"]:
                criteria_met += 1
            metrics["experience"] = experience_result["metrics"]

        return {
            "qualified": criteria_met >= 2,
            "criteria_met": criteria_met,
            "metrics": metrics
        }

    def _check_transaction_frequency(self, transactions: List[Transaction]) -> Dict:
        """Check if transaction frequency meets criteria"""
        if not transactions:
            return {"qualified": False, "metrics": {"avg_quarterly_transactions": 0}}

        df = pd.DataFrame([
            {
                "date": t.date,
                "value": t.value,
                "market": t.market
            } for t in transactions
        ])

        df["is_significant"] = df.apply(
            lambda x: x["value"] >= self.SIGNIFICANT_TRANSACTION_THRESHOLDS.get(
                x["market"], self.SIGNIFICANT_TRANSACTION_THRESHOLDS["default"]
            ),
            axis=1
        )

        df["quarter"] = df["date"].dt.to_period("Q")
        quarterly_counts = df[df["is_significant"]].groupby("quarter").size()

        last_4_quarters = quarterly_counts.tail(4)
        avg_quarterly_transactions = last_4_quarters.mean() if len(last_4_quarters) > 0 else 0

        return {
            "qualified": avg_quarterly_transactions >= 10,
            "metrics": {
                "avg_quarterly_transactions": avg_quarterly_transactions,
                "quarterly_breakdown": quarterly_counts.to_dict()
            }
        }

    def _check_portfolio_size(self, portfolio: Portfolio) -> Dict:
        """Check if portfolio size meets criteria"""
        total_value = portfolio.cash_deposits + portfolio.financial_instruments
        return {
            "qualified": total_value >= 500_000,
            "metrics": {
                "total_value": total_value,
                "cash_deposits": portfolio.cash_deposits,
                "financial_instruments": portfolio.financial_instruments
            }
        }

    def _check_professional_experience(self, experience: ProfessionalExperience) -> Dict:
        """Check if professional experience meets criteria"""
        qualified = (
            experience.sector.lower() == "financial" and
            experience.duration >= 1 and
            experience.position.lower() == "professional"
        )
        return {
            "qualified": qualified,
            "metrics": {
                "sector": experience.sector,
                "duration": experience.duration,
                "position": experience.position
            }
        }

def example_usage():
    client_data = ClientData(
        client_id="12345",
        entity_type=EntityType.OTHER,
        transactions=[
            Transaction(
                date=datetime.now() - timedelta(days=x),
                value=150000,
                market="fx_spot",
                type="spot"
            ) for x in range(0, 365, 7)
        ],
        portfolio=Portfolio(
            cash_deposits=300000,
            financial_instruments=300000
        ),
        experience=ProfessionalExperience(
            sector="financial",
            duration=1.5,
            position="professional"
        ),
        balance_sheet=25000000,
        net_turnover=45000000,
        own_funds=3000000
    )

    classifier = MiFIDClassifier()
    result = classifier.classify_client(client_data)

    print("Classification Result:")
    print(result)

if __name__ == "__main__":
    example_usage()

Classification Result:
{'classification': 'Per Se Professional', 'criteria_met': 3, 'metrics': {'balance_sheet': 25000000, 'net_turnover': 45000000, 'own_funds': 3000000}}
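
The result above reports `criteria_met`, and the elective professional test described at the top of the notebook requires two of three criteria. As a minimal sketch of that counting step, assuming each check returns a dict with a `"qualified"` key as the `_check_*` methods do, the rule could look like the following. The helper names `count_criteria_met` and `meets_two_of_three` are hypothetical and not part of the classifier.

```python
from typing import Dict, List


def count_criteria_met(check_results: List[Dict]) -> int:
    """Count how many check results are flagged as qualified."""
    # bool() also normalises numpy booleans returned by pandas comparisons.
    return sum(1 for result in check_results if bool(result["qualified"]))


def meets_two_of_three(check_results: List[Dict], required: int = 2) -> bool:
    """Return True when at least `required` criteria are met (hypothetical helper)."""
    return count_criteria_met(check_results) >= required


# Illustrative check results shaped like the dicts returned by the _check_* methods.
checks = [
    {"qualified": True, "metrics": {"avg_quarterly_transactions": 12.0}},
    {"qualified": True, "metrics": {"total_value": 600_000}},
    {"qualified": False, "metrics": {"duration": 0.5}},
]

print(count_criteria_met(checks))
print(meets_two_of_three(checks))
```

Keeping the count separate from the threshold makes it easy to report `criteria_met` alongside the final decision.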


In [None]:
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Dict, Optional
import pandas as pd
import numpy as np
from enum import Enum
import plotly.graph_objects as go
import plotly.express as px
np.random.seed(42)

In [None]:
class ClientClassification(Enum):
    RETAIL = "Retail"
    ELECTIVE_PROFESSIONAL = "Elective Professional"
    PER_SE_PROFESSIONAL = "Per Se Professional"
    INSTITUTIONAL = "Institutional Professional"

class EntityType(Enum):
    CREDIT_INSTITUTION = "credit_institution"
    INVESTMENT_FIRM = "investment_firm"
    FINANCIAL_INSTITUTION = "regulated_financial_institution"
    INVESTMENT_SCHEME = "collective_investment_scheme"
    PENSION_FUND = "pension_fund"
    COMMODITY_DEALER = "commodity_dealer"
    LOCAL_AUTHORITY = "local_authority"
    INSTITUTIONAL_INVESTOR = "institutional_investor"
    OTHER = "other"

@dataclass
class Transaction:
    date: datetime
    value: float
    market: str
    type: str

@dataclass
class Portfolio:
    cash_deposits: float
    financial_instruments: float

@dataclass
class ProfessionalExperience:
    sector: str
    duration: float
    position: str

@dataclass
class ClientData:
    client_id: str
    entity_type: EntityType
    transactions: List[Transaction]
    portfolio: Portfolio
    experience: Optional[ProfessionalExperience]
    balance_sheet: Optional[float]
    net_turnover: Optional[float]
    own_funds: Optional[float]
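
`ClientData.experience` is declared as `Optional`, while `_check_professional_experience` reads `experience.sector` directly. A minimal sketch of a guard for the `None` case is shown below, assuming a missing record should simply count as not qualified. The standalone function `check_professional_experience` is a hypothetical stand-in for the method, and the dataclass is repeated only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProfessionalExperience:
    sector: str
    duration: float  # years
    position: str


def check_professional_experience(experience: Optional[ProfessionalExperience]) -> Dict:
    """Same test as the classifier method, but tolerant of a missing record."""
    if experience is None:
        # Assumption: no experience record means the criterion is not met.
        return {"qualified": False, "metrics": {}}
    qualified = (
        experience.sector.lower() == "financial"
        and experience.duration >= 1
        and experience.position.lower() == "professional"
    )
    return {
        "qualified": qualified,
        "metrics": {
            "sector": experience.sector,
            "duration": experience.duration,
            "position": experience.position,
        },
    }


print(check_professional_experience(None)["qualified"])
print(check_professional_experience(
    ProfessionalExperience("financial", 1.5, "professional"))["qualified"])
```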

In [33]:
def generate_fx_sales_dataset(num_clients=1000):
    """Generate a realistic FX sales dataset"""
    np.random.seed(42)

    company_sizes = ['Small', 'Medium', 'Large', 'Enterprise']
    size_probabilities = [0.4, 0.3, 0.2, 0.1]

    entity_types = list(EntityType)
    num_entity_types = len(entity_types)
    entity_probabilities = [0.45/(num_entity_types-1)] * (num_entity_types-1) + [0.55]

    dataset = []

    for i in range(num_clients):
        try:
            company_size = np.random.choice(company_sizes, p=size_probabilities)

            if company_size == 'Enterprise':
                balance_sheet = abs(np.random.normal(50_000_000, 10_000_000))
                net_turnover = abs(np.random.normal(100_000_000, 20_000_000))
                own_funds = abs(np.random.normal(5_000_000, 1_000_000))
                base_transaction_value = abs(np.random.normal(500_000, 100_000))
                transaction_frequency = abs(np.random.normal(15, 3))
            elif company_size == 'Large':
                balance_sheet = abs(np.random.normal(15_000_000, 5_000_000))
                net_turnover = abs(np.random.normal(35_000_000, 10_000_000))
                own_funds = abs(np.random.normal(1_500_000, 500_000))
                base_transaction_value = abs(np.random.normal(250_000, 50_000))
                transaction_frequency = abs(np.random.normal(10, 3))
            elif company_size == 'Medium':
                balance_sheet = abs(np.random.normal(5_000_000, 2_000_000))
                net_turnover = abs(np.random.normal(15_000_000, 5_000_000))
                own_funds = abs(np.random.normal(750_000, 250_000))
                base_transaction_value = abs(np.random.normal(100_000, 25_000))
                transaction_frequency = abs(np.random.normal(6, 2))
            else:  # Small
                balance_sheet = abs(np.random.normal(1_000_000, 500_000))
                net_turnover = abs(np.random.normal(5_000_000, 2_000_000))
                own_funds = abs(np.random.normal(250_000, 100_000))
                base_transaction_value = abs(np.random.normal(50_000, 10_000))
                transaction_frequency = abs(np.random.normal(3, 1))

            num_transactions = int(max(1, transaction_frequency * 52))  # transaction_frequency is trades per week; scale to a 52-week year
            transactions = []
            for j in range(num_transactions):
                transaction_value = max(10000, abs(np.random.normal(base_transaction_value, base_transaction_value * 0.1)))
                transaction_date = datetime.now() - timedelta(days=np.random.randint(0, 365))
                market_type = np.random.choice(['fx_spot', 'fx_forward', 'fx_option'], p=[0.7, 0.2, 0.1])

                transactions.append(Transaction(
                    date=transaction_date,
                    value=transaction_value,
                    market=market_type,
                    type=market_type.split('_')[1]
                ))

            portfolio_total = max(100000, abs(np.random.normal(balance_sheet * 0.1, max(1000, balance_sheet * 0.02))))
            portfolio = Portfolio(
                cash_deposits=portfolio_total * 0.3,
                financial_instruments=portfolio_total * 0.7
            )

            has_financial_experience = np.random.choice([True, False], p=[0.2, 0.8])
            if has_financial_experience:
                experience = ProfessionalExperience(
                    sector="financial",
                    duration=abs(np.random.uniform(0, 10)),
                    position="professional" if np.random.random() > 0.3 else "junior"
                )
            else:
                experience = ProfessionalExperience(
                    sector="other",
                    duration=abs(np.random.uniform(0, 15)),
                    position="other"
                )

            client_data = ClientData(
                client_id=f"CLIENT_{i:04d}",
                entity_type=np.random.choice(entity_types, p=entity_probabilities),
                transactions=transactions,
                portfolio=portfolio,
                experience=experience,
                balance_sheet=balance_sheet,
                net_turnover=net_turnover,
                own_funds=own_funds
            )

            dataset.append(client_data)

        except Exception as e:
            print(f"Error generating client {i}: {str(e)}")
            continue

    return dataset

try:
    dataset = generate_fx_sales_dataset(10)  # Start with a small sample
    print(f"Successfully generated {len(dataset)} client records")

    results_df = analyze_dataset(dataset)
    print("\nDataset Analysis:")
    print(results_df['classification'].value_counts())

    dashboard = FXSalesDashboard(results_df)
    dashboard.run_server(debug=True)

except Exception as e:
    print(f"Error: {str(e)}")

Successfully generated 10 client records

Dataset Analysis:
classification
Institutional Professional    8
Retail                        2
Name: count, dtype: int64
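
The generator gives `EntityType.OTHER` a weight of 0.55 and splits the remaining 0.45 evenly across the other entity types. The sketch below rebuilds those weights on plain strings and checks them empirically; it assumes the same weighting formula as `generate_fx_sales_dataset` and uses NumPy's `default_rng` rather than the notebook's global seed, so the draws will not match the notebook's.

```python
import numpy as np

# String values copied from the EntityType enum; "other" must be last
# because the weight list puts the 0.55 share in the final slot.
entity_types = [
    "credit_institution", "investment_firm", "regulated_financial_institution",
    "collective_investment_scheme", "pension_fund", "commodity_dealer",
    "local_authority", "institutional_investor", "other",
]

n = len(entity_types)
weights = [0.45 / (n - 1)] * (n - 1) + [0.55]

# The weights must form a valid probability distribution.
assert np.isclose(sum(weights), 1.0)

rng = np.random.default_rng(42)
draws = rng.choice(entity_types, size=10_000, p=weights)

# Empirical share of "other" should sit close to the configured 0.55.
share_other = float(np.mean(draws == "other"))
print(round(share_other, 2))
```

With only 10 clients, as in the run above, the observed mix can differ widely from these weights.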



If a 403 error occurs when launching the dashboard, use the following code instead, which renders the charts inline with Plotly.

In [35]:
def create_interactive_dashboard(results_df):
    """Create interactive visualizations using Plotly"""

    # 1. Classification Distribution
    classification_counts = results_df['classification'].value_counts()
    fig1 = px.pie(
        values=classification_counts.values,
        names=classification_counts.index,
        title='Client Classification Distribution',
        template='plotly_white',
        color_discrete_sequence=px.colors.qualitative.Set3
    )
    fig1.show()

    # 2. Portfolio Size by Classification
    fig2 = px.box(
        results_df,
        x='classification',
        y='portfolio_size',
        title='Portfolio Size Distribution by Classification',
        template='plotly_white',
        color='classification'
    )
    fig2.update_layout(
        yaxis_title='Portfolio Size (EUR)',
        showlegend=False
    )
    fig2.show()

    # 3. Transaction Analysis
    fig3 = px.scatter(
        results_df,
        x='num_transactions',
        y='avg_transaction_value',
        color='classification',
        title='Transaction Analysis',
        template='plotly_white',
        size='portfolio_size',
        hover_data=['client_id', 'entity_type']
    )
    fig3.update_layout(
        xaxis_title='Number of Transactions',
        yaxis_title='Average Transaction Value (EUR)'
    )
    fig3.show()

    # 4. Entity Type Distribution
    entity_counts = results_df['entity_type'].value_counts()
    fig4 = px.bar(
        x=entity_counts.index,
        y=entity_counts.values,
        title='Distribution by Entity Type',
        template='plotly_white'
    )
    fig4.update_layout(
        xaxis_title='Entity Type',
        yaxis_title='Count',
        showlegend=False
    )
    fig4.show()

    # 5. Financial Metrics Comparison
    fig5 = go.Figure()
    metrics = ['balance_sheet', 'net_turnover', 'own_funds']

    for classification in results_df['classification'].unique():
        subset = results_df[results_df['classification'] == classification]
        for metric in metrics:
            fig5.add_trace(go.Box(
                y=subset[metric],
                name=f"{classification}<br>{metric.replace('_', ' ').title()}",
                boxpoints='outliers'
            ))

    fig5.update_layout(
        title='Financial Metrics by Classification',
        yaxis_title='EUR',
        template='plotly_white',
        showlegend=True
    )
    fig5.show()

    # Print Summary Statistics
    print("\nSummary Statistics:")
    print("=" * 50)

    print("\nClassification Distribution:")
    print(results_df['classification'].value_counts(normalize=True).round(3))

    print("\nAverage Portfolio Size by Classification:")
    print(results_df.groupby('classification')['portfolio_size'].mean().round(2))

    print("\nTransaction Statistics:")
    print(f"Average number of transactions: {results_df['num_transactions'].mean():.1f}")
    print(f"Average transaction value: €{results_df['avg_transaction_value'].mean():,.2f}")

    print("\nQualification Rates:")
    print(f"Proportion meeting portfolio criteria: {(results_df['portfolio_size'] >= 500000).mean():.1%}")
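    # Approximation: 10 per quarter x 4 quarters = 40 trades in total,
    # without checking transaction significance or the per-quarter spread.
    print(f"Proportion meeting transaction criteria: {(results_df['num_transactions'] >= 40).mean():.1%}")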
    print(f"Proportion with financial experience: {results_df['has_financial_experience'].mean():.1%}")

# Generate and analyze the dataset
try:
    print("Generating dataset...")
    dataset = generate_fx_sales_dataset(1000)
    print(f"Generated {len(dataset)} client records")

    print("\nAnalyzing dataset...")
    results_df = analyze_dataset(dataset)
    print("Analysis complete")

    print("\nCreating visualizations...")
    create_interactive_dashboard(results_df)

except Exception as e:
    print(f"Error occurred: {str(e)}")
    # Print more detailed error information
    import traceback
    print("\nDetailed error information:")
    print(traceback.format_exc())

Generating dataset...
Generated 1000 client records

Analyzing dataset...
Analysis complete

Creating visualizations...



Summary Statistics:

Classification Distribution:
classification
Institutional Professional    0.452
Retail                        0.326
Elective Professional         0.151
Per Se Professional           0.071
Name: proportion, dtype: float64

Average Portfolio Size by Classification:
classification
Elective Professional         1279035.42
Institutional Professional     956978.86
Per Se Professional           4602996.89
Retail                         209628.72
Name: portfolio_size, dtype: float64

Transaction Statistics:
Average number of transactions: 331.7
Average transaction value: €149,651.24

Qualification Rates:
Proportion meeting portfolio criteria: 44.1%
Proportion meeting transaction criteria: 99.3%
Proportion with financial experience: 20.6%
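
The transaction rate above uses a total of 40 trades as a stand-in for "an average of 10 per quarter over the last 4 quarters". A stricter reading counts trades per calendar quarter and treats empty quarters as zero. The following is a minimal sketch of that calculation, assuming the dates passed in are already filtered to significant transactions; `avg_trades_last_4_quarters` and its `as_of` parameter are hypothetical names, not part of the classifier.

```python
import pandas as pd


def avg_trades_last_4_quarters(dates, as_of) -> float:
    """Mean trades per quarter over the four calendar quarters ending at `as_of`.

    Quarters inside the window with no trades count as zero, and trades
    outside the window are ignored.
    """
    end = pd.Period(as_of, freq="Q")
    window = pd.period_range(end=end, periods=4, freq="Q")
    quarters = pd.Series(pd.to_datetime(dates)).dt.to_period("Q")
    # Reindexing on the fixed window fills missing quarters with zero.
    counts = quarters.value_counts().reindex(window, fill_value=0)
    return float(counts.mean())


# Twelve trades in a single quarter, none in the other three quarters,
# plus one old trade that falls outside the four-quarter window.
dates = ["2024-11-15"] * 12 + ["2023-01-10"]
print(avg_trades_last_4_quarters(dates, as_of="2024-12-31"))
```

Under this reading a client who trades heavily in one quarter only does not reach the threshold of 10, whereas averaging only over quarters that contain trades would report 12.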


Data visualisation using the Python library Bokeh

In [39]:
from bokeh.plotting import figure, show
from bokeh.layouts import column, row, gridplot
from bokeh.palettes import Spectral6, Category20
from bokeh.transform import factor_cmap
from bokeh.models import ColumnDataSource, HoverTool, Legend, Div, DataTable, TableColumn
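from bokeh.io import output_notebook

# Render Bokeh output inline in the notebook instead of opening a separate HTML file.
output_notebook()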

def create_interactive_analysis(results_df):
    """Create interactive visualizations using Bokeh"""

    source = ColumnDataSource(results_df)

    class_counts = results_df['classification'].value_counts()

    p1 = figure(
        x_range=class_counts.index.tolist(),
        height=400,
        title='Client Classification Distribution',
        toolbar_location=None,
        tools="hover",
        tooltips=[('Classification', '@x'), ('Count', '@top')]
    )

    p1.vbar(
        x=class_counts.index.tolist(),
        top=class_counts.values,
        width=0.9,
        fill_color=factor_cmap('x', Spectral6, class_counts.index.tolist()),
        line_color='white'
    )

    p1.xgrid.grid_line_color = None
    p1.xaxis.axis_label = 'Classification'
    p1.yaxis.axis_label = 'Number of Clients'
    p1.xaxis.major_label_orientation = 0.7

    p2 = figure(
        height=400,
        title='Portfolio Size vs Number of Transactions',
        tools="pan,box_zoom,reset,hover",
        tooltips=[
            ('Classification', '@classification'),
            ('Portfolio Size', '€@portfolio_size{0,0}'),
            ('Client ID', '@client_id')
        ]
    )

    for classification, color in zip(results_df['classification'].unique(), Category20[20]):
        df_subset = results_df[results_df['classification'] == classification]
        source_subset = ColumnDataSource(df_subset)
        p2.scatter(
            'num_transactions',
            'portfolio_size',
            size=8,
            alpha=0.6,
            color=color,
            legend_label=classification,
            source=source_subset
        )

    p2.legend.click_policy = "hide"
    p2.xaxis.axis_label = 'Number of Transactions'
    p2.yaxis.axis_label = 'Portfolio Size (EUR)'

    p3 = figure(
        height=400,
        title='Transaction Analysis',
        tools="pan,box_zoom,reset,hover",
        tooltips=[
            ('Classification', '@classification'),
            ('Avg Transaction Value', '€@avg_transaction_value{0,0}'),
            ('Number of Transactions', '@num_transactions')
        ]
    )

    for classification, color in zip(results_df['classification'].unique(), Category20[20]):
        df_subset = results_df[results_df['classification'] == classification]
        source_subset = ColumnDataSource(df_subset)
        # scatter() matches p2; circle() with a screen-unit size is deprecated in recent Bokeh releases.
        p3.scatter(
            'num_transactions',
            'avg_transaction_value',
            size=8,
            alpha=0.6,
            color=color,
            legend_label=classification,
            source=source_subset
        )

    p3.legend.click_policy = "hide"
    p3.xaxis.axis_label = 'Number of Transactions'
    p3.yaxis.axis_label = 'Average Transaction Value (EUR)'

    qualification_metrics = pd.DataFrame({
        'Metric': ['Portfolio Size', 'Transaction Frequency', 'Financial Experience'],
        'Rate': [
            (results_df['portfolio_size'] >= 500000).mean() * 100,
            (results_df['num_transactions'] >= 40).mean() * 100,
            results_df['has_financial_experience'].mean() * 100
        ]
    })

    p4 = figure(
        x_range=qualification_metrics['Metric'].tolist(),
        height=400,
        title='Qualification Rates (%)',
        toolbar_location=None,
        tools="hover",
        tooltips=[('Metric', '@x'), ('Rate', '@top%')]
    )

    p4.vbar(
        x=qualification_metrics['Metric'].tolist(),
        top=qualification_metrics['Rate'],
        width=0.9,
        fill_color=factor_cmap('x', Spectral6[:3], qualification_metrics['Metric'].tolist()),
        line_color='white'
    )

    p4.xgrid.grid_line_color = None
    p4.xaxis.major_label_orientation = 0.7

    summary_stats = results_df.groupby('classification').agg({
        'portfolio_size': ['mean', 'count'],
        'avg_transaction_value': 'mean',
        'num_transactions': 'mean'
    }).round(2)

    summary_stats.columns = ['Avg Portfolio Size', 'Count', 'Avg Transaction Value', 'Avg Num Transactions']
    summary_stats = summary_stats.reset_index()

    columns = [
        TableColumn(field='classification', title='Classification'),
        TableColumn(field='Count', title='Count'),
        TableColumn(field='Avg Portfolio Size', title='Avg Portfolio (€)'),
        TableColumn(field='Avg Transaction Value', title='Avg Transaction (€)'),
        TableColumn(field='Avg Num Transactions', title='Avg Transactions')
    ]

    data_table = DataTable(
        source=ColumnDataSource(summary_stats),
        columns=columns,
        width=800,
        height=200
    )

    title_div = Div(text="<h1>MiFID Classification Analysis</h1>")

    layout = column(
        title_div,
        row(p1, p2),
        row(p3, p4),
        data_table
    )

    show(layout)

dataset = generate_fx_sales_dataset(1000)
results_df = analyze_dataset(dataset)

create_interactive_analysis(results_df)

