In [16]:
import os
from dotenv import load_dotenv

load_dotenv()

True

In [17]:
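# load_dotenv() has already copied the .env values into os.environ, so these
# assignments act only as a check: each raises TypeError if the variable is unset.
os.environ["AZURE_OPENAI_API_KEY"] = os.getenv("AZURE_OPENAI_API_KEY")
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")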

In [43]:
# Turn on LangSmith tracing and choose the project that traces are logged under.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT")

In [19]:
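# AzureChatOpenAI reads the endpoint from AZURE_OPENAI_ENDPOINT; here the value
# is stored under the shorter name AZURE_ENDPOINT, so copy it across.
os.environ["AZURE_OPENAI_ENDPOINT"] = os.getenv("AZURE_ENDPOINT")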
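
The assignments above fail with a `TypeError` when a variable is missing, because `os.environ` accepts only string values. A minimal sketch of a friendlier check follows; the helper name `missing_env_vars` and the `REQUIRED` list are illustrative assumptions, not part of the notebook or of python-dotenv.

```python
import os
from typing import List, Mapping, Optional


def missing_env_vars(names: List[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Variable names taken from the cells above.
REQUIRED = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_ENDPOINT",
    "LANGCHAIN_API_KEY",
    "LANGCHAIN_PROJECT",
]

# Demonstration with an explicit mapping rather than the real process environment.
example_env = {"AZURE_OPENAI_API_KEY": "placeholder", "AZURE_ENDPOINT": ""}
print(missing_env_vars(REQUIRED, example_env))
```

In the notebook itself, calling `missing_env_vars(REQUIRED)` right after `load_dotenv()` would list every variable that still needs a value before any client is constructed.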

In [20]:
from langchain_openai import AzureChatOpenAI

In [21]:
llm = AzureChatOpenAI(model="gpt-4o", api_version="2024-05-01-preview")

In [22]:
print(llm)

client=<openai.resources.chat.completions.Completions object at 0x7f3118796ec0> async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x7f311819ec50> model_name='gpt-4o' openai_api_key=SecretStr('**********') openai_proxy='' azure_endpoint='https://genaiexplorencus.openai.azure.com/' openai_api_version='2024-05-01-preview' openai_api_type='azure'


In [23]:
response = llm.invoke("What is the difference between Parameter Efficient Fine Tuning and Standard Fine Tuning?")

In [24]:
from IPython.display import display, Markdown

In [25]:
response

AIMessage(content="Fine-tuning is a common technique in machine learning, particularly in the context of transfer learning, where a pre-trained model is adapted to a new, specific task. There are two primary approaches to fine-tuning: Standard Fine-Tuning and Parameter Efficient Fine-Tuning (PEFT). Here's a breakdown of the differences between them:\n\n### Standard Fine-Tuning\n1. **Scope of Adjustment**:\n   - In standard fine-tuning, the entire pre-trained model is unfrozen, and all parameters are adjusted based on the new task's training data.\n   \n2. **Resource Requirements**:\n   - This approach typically requires significant computational resources and memory, as all parameters of the model are being updated. \n\n3. **Flexibility**:\n   - It provides more flexibility and the potential for the model to fully adapt to the new task, as all weights can be modified.\n\n4. **Risk of Overfitting**:\n   - There's a higher risk of overfitting, especially if the new dataset is small, beca

In [26]:
display(Markdown(response.content))

Fine-tuning is a common technique in machine learning, particularly in the context of transfer learning, where a pre-trained model is adapted to a new, specific task. There are two primary approaches to fine-tuning: Standard Fine-Tuning and Parameter Efficient Fine-Tuning (PEFT). Here's a breakdown of the differences between them:

### Standard Fine-Tuning
1. **Scope of Adjustment**:
   - In standard fine-tuning, the entire pre-trained model is unfrozen, and all parameters are adjusted based on the new task's training data.
   
2. **Resource Requirements**:
   - This approach typically requires significant computational resources and memory, as all parameters of the model are being updated. 

3. **Flexibility**:
   - It provides more flexibility and the potential for the model to fully adapt to the new task, as all weights can be modified.

4. **Risk of Overfitting**:
   - There's a higher risk of overfitting, especially if the new dataset is small, because the model might over-adjust to the new task's specific data.

5. **Training Time**:
   - Generally, it takes longer to fine-tune the entire model, particularly for large models like BERT, GPT, or other deep learning architectures.

### Parameter Efficient Fine-Tuning (PEFT)
1. **Scope of Adjustment**:
   - PEFT focuses on adjusting a smaller, more targeted subset of parameters rather than the entire model. Techniques include adding small trainable modules (like adapters), tuning only the top layers, or using methods like LoRA (Low-Rank Adaptation).

2. **Resource Requirements**:
   - This approach is more resource-efficient, requiring less computational power and memory, as fewer parameters are being updated.

3. **Flexibility**:
   - While PEFT may not adapt the model as fully as standard fine-tuning, it often strikes a good balance between performance and efficiency, particularly for tasks that are not drastically different from the original task the model was pre-trained on.

4. **Risk of Overfitting**:
   - The risk of overfitting is generally lower with PEFT, as fewer parameters are being adjusted, reducing the model's capacity to overfit to the new dataset.

5. **Training Time**:
   - PEFT usually results in faster training times because fewer parameters need to be updated, making it more suitable for scenarios with limited resources or when quick adaptation is required.

### Summary
- **Standard Fine-Tuning**: Adjusts all parameters of the pre-trained model, offering maximum flexibility but at the cost of higher computational resources, longer training times, and a higher risk of overfitting.
- **Parameter Efficient Fine-Tuning (PEFT)**: Adjusts a smaller subset of parameters, offering a more resource-efficient and faster alternative with a reduced risk of overfitting, but potentially less flexibility in adapting to the new task.

Both approaches have their merits and are chosen based on the specific constraints and requirements of the task at hand.
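
To make the "fewer parameters" claim in the answer above concrete, here is a small arithmetic sketch for LoRA on a single weight matrix. It assumes the usual LoRA formulation, where a frozen `d_out x d_in` matrix gets a trainable update factored into two matrices of rank `r`; the layer sizes and the rank are illustrative values, not taken from any particular model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for a rank-`rank` update B @ A, with B: d_out x rank and A: rank x d_in."""
    return rank * (d_out + d_in)


# Illustrative sizes (assumed for this example only).
d_out, d_in, rank = 4096, 4096, 8

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} trainable weights")
print(f"LoRA (rank {rank}): {lora:,} trainable weights")
print(f"reduction factor: {full // lora}x")
```

With these sizes the low-rank update trains 65,536 weights instead of 16,777,216 for that one matrix, which is where the memory and training-time savings described above come from.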

### ChatPromptTemplate

In [27]:
from langchain_core.prompts import ChatPromptTemplate

In [28]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert AI Engineer. Answer questions based on user's queries"),
        ("user", "{input}")
    ]
)

In [29]:
prompt

ChatPromptTemplate(input_variables=['input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template="You are an expert AI Engineer. Answer questions based on user's queries")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])

In [30]:
chain = prompt | llm
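
The `|` operator composes the two objects so that the prompt's output becomes the model's input. The sketch below illustrates that idea in plain Python. It is a simplified stand-in, not LangChain's actual implementation; the `Step` class, `fake_prompt`, and `fake_llm` are hypothetical names introduced only for this example.

```python
class Step:
    """A minimal composable step: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt template and the chat model.
fake_prompt = Step(lambda variables: [
    ("system", "You are an expert AI Engineer."),
    ("user", variables["input"]),
])
fake_llm = Step(lambda messages: f"(reply to: {messages[-1][1]})")

fake_chain = fake_prompt | fake_llm
print(fake_chain.invoke({"input": "What is LoRA?"}))
```

The real chain is invoked the same way: with a dict whose keys match the template's input variables.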

In [31]:
res = chain.invoke({"input": "What is the difference between machine translation with Sequence to Sequence models vs transformers models? Which one should be used if I want to get custom output as per my project requirements for answering customer's queries related to products?"})

In [32]:
display(Markdown(res.content))

Machine translation has evolved significantly with the advent of different model architectures, particularly Sequence to Sequence (Seq2Seq) models and Transformer models. Here’s a breakdown of the differences between these two approaches and guidance on which one might be better suited for your project's requirements.

### Sequence to Sequence (Seq2Seq) Models:

**Architecture**:
- Seq2Seq models typically consist of an encoder and a decoder, often implemented using Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM) networks.
- The encoder processes the input sequence and converts it into a fixed-size context vector.
- The decoder then generates the output sequence based on the context vector provided by the encoder.

**Strengths**:
- Effective for handling sequential data and maintaining order in input sequences.
- Can manage varying lengths of input and output sequences.

**Limitations**:
- Struggles with long-range dependencies due to the fixed-size context vector.
- Training can be slow since RNNs process data sequentially.
- Prone to issues like vanishing and exploding gradients.

### Transformer Models:

**Architecture**:
- Transformers are based on the self-attention mechanism, allowing them to weigh the importance of different parts of the input sequence differently.
- They consist of an encoder and a decoder but do not rely on RNNs. Instead, they use multiple layers of self-attention and fully connected layers.
- The attention mechanism allows the model to consider the entire sequence at once, rather than step-by-step as in RNNs.

**Strengths**:
- Handles long-range dependencies more effectively due to the self-attention mechanism.
- Can be parallelized, leading to faster training times.
- Generally provides better performance on a wide range of NLP tasks, including translation.

**Limitations**:
- Requires more computational resources compared to Seq2Seq models.
- Can be more complex to implement and fine-tune.

### Choosing the Right Model for Customer Query Answering:

Given your project requirements for answering customer queries related to products, you would likely benefit from the strengths of Transformer models. Here's why:

1. **Performance**: Transformers, especially models like BERT, GPT, or T5, have set new benchmarks in NLP tasks, including understanding and generating human-like text.
2. **Customizability**: Transformers can be fine-tuned on specific datasets, making them well-suited for creating custom outputs tailored to your product-related queries.
3. **Handling Context**: Transformers are adept at maintaining context over long sequences, which is crucial for understanding and responding to complex customer queries accurately.

### Practical Steps:

1. **Pre-trained Models**: Consider starting with pre-trained transformer models like BERT, GPT-3, or T5, which can be fine-tuned on your specific dataset.
2. **Fine-tuning**: Collect a dataset of customer queries and corresponding appropriate responses. Fine-tune the pre-trained model on this dataset to adapt it to your specific needs.
3. **Deployment**: Use frameworks like Hugging Face's Transformers library, which simplifies working with these models and provides tools for training, fine-tuning, and deployment.

### Conclusion:

While Seq2Seq models have their merits, Transformer models are generally more powerful and flexible for modern NLP tasks, including generating custom responses to customer queries. Their ability to handle long-range dependencies and context makes them a better fit for your project requirements.
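
The self-attention mechanism described in the answer above can be sketched in a few lines of plain Python. This is a deliberately reduced version: it uses the input vectors directly as queries, keys, and values, and leaves out the learned projection matrices, multiple heads, and positional encodings that a real Transformer layer has. The token vectors are made-up values.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this position against every position in the sequence at once.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all value vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how much one position attends to every other position. Because no row depends on a previous step's result, all rows can be computed in parallel, unlike the step-by-step recurrence of an RNN.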

In [44]:
# llm1 is a second model instance; the cell that creates it is not among the cells shown above.
res1 = llm1.invoke("What is the role of AI Engineer vs Data Scientist vs Machine Learning Engineer in AI Projects?")

In [46]:
display(Markdown(res1.content))

AI Engineer, Data Scientist, and Machine Learning Engineer are all crucial roles in AI projects, each with distinct responsibilities and skill sets.

1. AI Engineer: An AI Engineer is responsible for designing and developing AI systems and algorithms to solve complex problems. They typically have a strong background in computer science, mathematics, and engineering. AI Engineers work on developing AI models, integrating them into existing systems, and optimizing their performance. They also work on improving AI algorithms and ensuring they meet the desired objectives. Overall, AI Engineers focus on the technical aspects of AI development and implementation.

2. Data Scientist: A Data Scientist is responsible for analyzing and interpreting large amounts of data to derive insights and make informed decisions. They have expertise in statistics, data analysis, and machine learning. Data Scientists work on cleaning and preprocessing data, building predictive models, and conducting experiments to test hypotheses. They also communicate their findings to stakeholders and provide recommendations for business decisions. Data Scientists focus on extracting value from data and leveraging it to drive strategic initiatives.

3. Machine Learning Engineer: A Machine Learning Engineer is responsible for building and deploying machine learning models in production environments. They have expertise in machine learning algorithms, model training, and deployment. Machine Learning Engineers work on collecting and preparing data, building and training models, and deploying them into production systems. They also monitor model performance, optimize algorithms, and ensure they are scalable and reliable. Machine Learning Engineers focus on the practical implementation of machine learning solutions.

In summary, AI Engineers focus on developing AI systems and algorithms, Data Scientists focus on analyzing and deriving insights from data, and Machine Learning Engineers focus on building and deploying machine learning models. Each role is essential in AI projects and contributes to the successful development and implementation of AI solutions.