In [2]:
import ollama

response = ollama.generate(
    model='llama3:8b',
    prompt='What is a large language model?',
)
print(response['response'])

A large language model (LLM) is a type of artificial intelligence (AI) designed to process and generate human-like text at an unprecedented scale. LLMs are trained on vast amounts of text data, allowing them to learn patterns, relationships, and nuances in language that enable them to understand and respond to natural language inputs.

Characteristics of Large Language Models:

1. **Scale**: LLMs are massive in size, often consisting of billions or even trillions of parameters, which allows them to capture complex relationships between words.
2. **Training data**: They're trained on enormous datasets, typically sourced from the internet, books, and other sources, allowing them to learn about various topics, styles, and languages.
3. **Self-supervised learning**: LLMs often use self-supervised learning techniques, where they're trained to predict specific aspects of text, such as next words or sentence completions, rather than being explicitly supervised with labeled data.
4. **Transfor

In [1]:
dctSummarize = {
    "Extracted Data": {
        "Company Name": "HealthInc",
        "Industry": "Healthcare",
        "Market Capitalization": 3000,
        "Revenue (in millions)": 1000,
        "EBITDA (in millions)": 250,
        "Net Income (in millions)": 80,
        "Debt (in millions)": 150,
        "Equity (in millions)": 666,
        "Enterprise Value (in millions)": 3150,
        "P/E Ratio": 15,
        "Revenue Growth Rate (%)": 12,
        "EBITDA Margin (%)": 40,
        "Net Income Margin (%)": 8,
        "ROE (Return on Equity) (%)": 13.33,
        "ROA (Return on Assets) (%)": 10,
        "Debt to Equity Ratio": 0.25,
        "Location": "New York, NY",
        "CEO": "Jane Smith",
        "Number of Employees": 3000
    },
    "Internal Data": {
        "Company Name": "HealthInc",
        "Industry": "Healthcare",
        "Market Capitalization": 3000,
        "Revenue (in millions)": 1000,
        "EBITDA (in millions)": 250,
        "Net Income (in millions)": 80,
        "Debt (in millions)": 150,
        "Equity (in millions)": 600,
        "Enterprise Value (in millions)": 3150,
        "P/E Ratio": 15,
        "Revenue Growth Rate (%)": 12,
        "EBITDA Margin (%)": 25.0,
        "Net Income Margin (%)": 8.0,
        "ROE (Return on Equity) (%)": 13.33,
        "ROA (Return on Assets) (%)": 10.0,
        "Current Ratio": 2.0,
        "Debt to Equity Ratio": 0.25,
        "Location": "New York"
    },
    "Discrepancies": {
        "Equity (in millions)": {
            "Extracted": 666,
            "Internal": 600
        },
        "EBITDA Margin (%)": {
            "Extracted": 40,
            "Internal": 25.0
        },
        "Location": {
            "Extracted": "New York, NY",
            "Internal": "New York"
        }
    },
    "Missing in Extracted Data": [
        "Current Ratio"
    ],
    "Missing in Internal Data": [
        "CEO",
        "Number of Employees"
    ]
}
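
The `Discrepancies` and `Missing in ...` entries above do not have to be typed by hand; they can be derived from the two records. The following is a minimal sketch, assuming both records are flat dictionaries and that a plain equality check is good enough (so `"New York, NY"` and `"New York"` count as different). `compare_records` is a hypothetical helper, not part of the notebook's `LLM` package.

```python
def compare_records(extracted, internal):
    """Compare two flat dicts and report differing and missing fields."""
    # Fields present in both records whose values differ
    discrepancies = {
        field: {"Extracted": extracted[field], "Internal": internal[field]}
        for field in extracted
        if field in internal and extracted[field] != internal[field]
    }
    # Fields that appear in only one of the two records
    missing_in_extracted = [f for f in internal if f not in extracted]
    missing_in_internal = [f for f in extracted if f not in internal]
    return {
        "Discrepancies": discrepancies,
        "Missing in Extracted Data": missing_in_extracted,
        "Missing in Internal Data": missing_in_internal,
    }


# Small demo using a few of the fields from dctSummarize
extracted = {
    "P/E Ratio": 15,
    "Equity (in millions)": 666,
    "Location": "New York, NY",
    "CEO": "Jane Smith",
}
internal = {
    "P/E Ratio": 15,
    "Equity (in millions)": 600,
    "Location": "New York",
    "Current Ratio": 2.0,
}
report = compare_records(extracted, internal)
print(report)
```

Because Python treats `8` and `8.0` as equal, integer and float versions of the same number are not flagged. A real comparison would probably also need a tolerance for rounded ratios and some normalization for text fields.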

In [3]:
from LLM.llms_openAI import prompttemplate, extractPromptData
from langchain_core.output_parsers import StrOutputParser
from langchain_community.llms import Ollama
from LLM.chain import create_prompt_template
from langchain_core.prompts import PromptTemplate
import ollama

In [10]:
company_name = 'HealthInc'

   

In [6]:
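prompt_template = prompttemplate()
# These names must match the placeholders used in the template returned by prompttemplate()
input_variables = ["internal_data", "external_data", "discrepancy_fields", "missing_extracted_fields", "missing_internal_fields", "company name"]

# Create the PromptTemplate instance
prompt = PromptTemplate(input_variables=input_variables, template=prompt_template)

# Run a local Llama 3 model through Ollama and parse the reply as a plain string
llm = Ollama(model="llama3:8b")
chain = prompt | llm | StrOutputParser()

# Build the template inputs from the comparison summary, then run the chain
data_with_prompt_data = extractPromptData(dctSummarize, company_name)
summary = chain.invoke(data_with_prompt_data)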
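
`extractPromptData` comes from the local `LLM` package and its body is not shown in this notebook. As a rough sketch of what such a helper might return, the hypothetical `build_prompt_data` below maps the summary dictionary onto the placeholder names listed in `input_variables`. The key names, the JSON formatting, and the underscore spelling `company_name` are all assumptions, not the actual implementation.

```python
import json


def build_prompt_data(summary, company_name):
    """Hypothetical stand-in for extractPromptData: one string per placeholder."""
    return {
        "company_name": company_name,
        # Full records are serialized so the model sees every field and value
        "internal_data": json.dumps(summary["Internal Data"], indent=2),
        "external_data": json.dumps(summary["Extracted Data"], indent=2),
        # Only the field names are listed here; the values are in the records above
        "discrepancy_fields": ", ".join(summary["Discrepancies"]),
        "missing_extracted_fields": ", ".join(summary["Missing in Extracted Data"]),
        "missing_internal_fields": ", ".join(summary["Missing in Internal Data"]),
    }


# Small demo with a cut-down summary in the same shape as dctSummarize
demo_summary = {
    "Extracted Data": {"Equity (in millions)": 666, "CEO": "Jane Smith"},
    "Internal Data": {"Equity (in millions)": 600, "Current Ratio": 2.0},
    "Discrepancies": {"Equity (in millions)": {"Extracted": 666, "Internal": 600}},
    "Missing in Extracted Data": ["Current Ratio"],
    "Missing in Internal Data": ["CEO"],
}
prompt_data = build_prompt_data(demo_summary, "HealthInc")
print(prompt_data["discrepancy_fields"])
```

Whatever the real helper does, the dictionary passed to `chain.invoke` needs one entry for every placeholder that appears in the template.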

In [7]:
print(summary)

**Data Source Comparison Check Summary**

### Internal Data vs. External Data Comparison

After reviewing the internal and external data sources, we identified discrepancies in the following fields:

* **Equity (in millions)**: The internal data source reports an equity value of 600 million, whereas the external data source reports a value of 666 million.
* **EBITDA Margin (%)**: The internal data source reports an EBITDA margin of 25.0%, whereas the external data source reports a margin of 40%.
* **Location**: Although both sources report New York as the location, the external data source includes an additional detail (New York, NY).

We also identified missing fields from each source:

* **Missing from Extracted Data:** Current Ratio
* **Missing from Internal Data:** CEO, Number of Employees

### Discrepancy Analysis

The discrepancies in equity values may indicate a potential error or difference in reporting methodologies. The disparity in EBITDA margins could be due to differences 

In [5]:
from langchain_openai import ChatOpenAI

In [7]:
def extractPromptdetails(dctSummarize):
    """Build a bulleted findings string from the comparison summary."""
    finding_lines = []

    # One bullet per discrepancy, with the value reported by each source
    for field, discrepancy in dctSummarize["Discrepancies"].items():
        finding_lines.append(f"\n* Discrepancy in {field}:")
        finding_lines.append(f"  * Extracted Data: {discrepancy['Extracted']}")
        finding_lines.append(f"  * Internal Data: {discrepancy['Internal']}")

    # One bullet per field that is absent from either source
    for missing_field in dctSummarize["Missing in Extracted Data"]:
        finding_lines.append(f"\n* Missing field in Extracted Data: {missing_field}")

    for missing_field in dctSummarize["Missing in Internal Data"]:
        finding_lines.append(f"\n* Missing field in Internal Data: {missing_field}")

    return "\n".join(finding_lines)

In [11]:
# Define a template for the prompt with placeholders
prompt_template = """Based on the analysis of financial data from two sources for {company_name}, the following key findings can be summarized regarding its financial health:

{findings}

Overall, the discrepancies and missing information in the financial data of {company_name} raise concerns about the accuracy, transparency, and completeness of the company's financial reporting. These gaps make it challenging to make a comprehensive assessment of {company_name}'s financial health.
"""

# Specify input variables for the template
input_variables = ["company_name", "findings"]

# Create the PromptTemplate instance
prompt = PromptTemplate(input_variables=input_variables, template=prompt_template)

# Define LLM chain
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo-16k")

findings_str = extractPromptdetails(dctSummarize)

# Prepare data dictionary with company name and findings
data = {
    "company_name": company_name,
    "findings": findings_str,
}

chain = prompt | llm | StrOutputParser()

# Run the LLM chain with prepared data
summary = chain.invoke(data)

In [13]:
print(summary)

The discrepancy in equity between the extracted data and internal data suggests that there may be inconsistencies in how the company's equity is reported. This could be due to errors in data extraction or discrepancies in the company's financial records.

The difference in EBITDA margin indicates a significant variation in the company's profitability. The extracted data shows a higher EBITDA margin of 40%, while the internal data reports a lower margin of 25%. This difference could be attributed to different calculation methods or inaccuracies in the data sources.

The discrepancy in location, with the extracted data specifying "New York, NY" and the internal data stating "New York," raises questions about the accuracy of the company's location information. It is important to have consistent and accurate location data for financial analysis and reporting purposes.

The missing field of the current ratio in the extracted data suggests that this important financial metric is not availabl

In [12]:
#print(summary)
import textwrap

# Wrap the summary to 120 columns; note that fill() also collapses existing line breaks
wrapped_text = textwrap.fill(summary, width=120)
print(wrapped_text)

Company Name | HealthInc | | Industry | Healthcare | | Market Capitalization | 3000 | | Revenue (in millions) | 1000 | |
EBITDA (in millions) | 250 | | Net Income (in millions) | 80 | | Debt (in millions) | 150 | | Equity (in millions) | 600
| | Enterprise Value (in millions) | 3150 | | P/E Ratio | 15 | | Revenue Growth Rate (%) | 12 | | EBITDA Margin (%) |
25.0 | | Net Income Margin (%) | 8.0 | | ROE (Return on Equity) (%) | 13.33 | | ROA (Return on Assets) (%) | 10.0 | |
Current Ratio | 2.0 | | Debt to Equity Ratio | 0.25 | | Location | New York |  **External Data**  | Field | Value | |
--- | --- | | Company Name | HealthInc | | Industry | Healthcare | | Market Capitalization | 3000 | | Revenue (in
millions) | 1000 | | EBITDA (in millions) | 250 | | Net Income (in millions) | 80 | | Debt (in millions) | 150 | |
Equity (in millions) | 666 | | Enterprise Value (in millions) | 3150 | | P/E Ratio | 15 | | Revenue Growth Rate (%) | 12
| | EBITDA Margin (%) | 40 | | Net Income Margin (%) |
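
The table rows in the output above run together because `textwrap.fill` treats the whole string as one paragraph and replaces every newline with a space. One way to keep the rows intact is to wrap each line separately. This is a small sketch; `wrap_preserving_lines` is a hypothetical helper and the sample table is made up for illustration.

```python
import textwrap


def wrap_preserving_lines(text, width=120):
    """Wrap each line on its own so the original line breaks survive."""
    wrapped = []
    for line in text.splitlines():
        # fill() returns an empty string for a blank line, which keeps paragraph gaps
        wrapped.append(textwrap.fill(line, width=width))
    return "\n".join(wrapped)


sample = "| Field | Value |\n| --- | --- |\n| Company Name | HealthInc |"
result = wrap_preserving_lines(sample, width=40)
print(result)
```

Lines longer than `width` are still broken, so a very wide table row would wrap onto a second line; short rows pass through unchanged.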