In [1]:
import os
import dotenv
from groq import Groq
from IPython.display import display_markdown

dotenv.load_dotenv()
api_key = os.environ.get("GROQ_API_KEY")

if not api_key:
    raise ValueError("GROQ_API_KEY environment variable not set")

In [2]:
def get_llama_assistance(prompt):
    """Send a prompt to Llama 3 70B on Groq and display the reply as markdown."""
    client = Groq(api_key=api_key)
    completion = client.chat.completions.create(
        model="llama3-70b-8192",
        # A single user turn is enough; no empty assistant message is needed
        messages=[{"role": "user", "content": prompt}],
        temperature=1.4,
        max_tokens=8192,
        top_p=1,
        stream=True,
        stop=None,
    )

    # Accumulate the streamed chunks into one string
    response_text = ""
    for chunk in completion:
        response_text += chunk.choices[0].delta.content or ""

    # Display the formatted response in markdown
    display_markdown(f"`\n{response_text}\n`", raw=True)

In [3]:
cwd = os.getcwd()
path1 = os.path.dirname(cwd)
poppler_path = os.path.join(path1, 'Gemini', 'poppler-24.02.0', 'Library', 'bin')
print(poppler_path)

d:\CODING\Generative-AI\Gemini\poppler-24.02.0\Library\bin


In [9]:
get_llama_assistance("can I get a code to extract the data from a PDF file?")

`
Extracting data from a PDF file can be a bit challenging, but it's doable using various libraries and techniques depending on the programming language you prefer. Here are a few examples:

**Python**

You can use libraries like PyPDF2, pdfminer, or pdfquery to extract data from PDF files.

Example using PyPDF2:
```python
import PyPDF2

# Open the PDF file; the with block closes it automatically
with open('example.pdf', 'rb') as pdf_file:
    # PdfReader and extract_text() replace the removed PdfFileReader and extractText()
    pdf_reader = PyPDF2.PdfReader(pdf_file)

    # Extract text from each page
    text = ''
    for page in pdf_reader.pages:
        text += page.extract_text()

print(text)
```
Example using pdfminer:
```python
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextContainer
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser

# Set up the layout analysis pipeline
resource_manager = PDFResourceManager()
params = LAParams()
device = PDFPageAggregator(resource_manager, laparams=params)
interpreter = PDFPageInterpreter(resource_manager, device)

# Keep the file open while the pages are processed
with open('example.pdf', 'rb') as file:
    parser = PDFParser(file)
    document = PDFDocument(parser)

    # Extract text from the PDF
    for page in PDFPage.create_pages(document):
        interpreter.process_page(page)
        layout = device.get_result()
        for element in layout:
            if isinstance(element, LTTextContainer):
                for text_line in element:
                    print(text_line.get_text())
```
**Java**

You can use libraries like Apache PDFBox or iText to extract data from PDF files.

Example using Apache PDFBox:
```java
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Open the PDF file
PDDocument pdf = PDDocument.load(new File("example.pdf"));

// Extract text from the PDF
PDFTextStripper stripper = new PDFTextStripper();
String text = stripper.getText(pdf);

System.out.println(text);

// Close the PDF file
pdf.close();
```
**R**

You can use libraries like pdftools or pdf2text to extract data from PDF files.

Example using pdftools:
```R
library(pdftools)

# Open the PDF file
pdf_file <- "example.pdf"

# Extract text from the PDF
text <- pdf_text(pdf_file)

print(text)
```
These are just a few examples, and the specific approach will depend on the complexity of the PDF file, the structure of the data you want to extract, and the programming language you're using. Additionally, the quality of the extracted data may vary depending on the quality of the PDF file and the extraction algorithm used.
`

In [10]:
# !pip install pdfquery
# !pip install pdfplumber

In [14]:
import pdfplumber

# Windows-style relative path to the resume PDF
path = r'Resume\Aaditya_Pandagle_Resume_2024.pdf'

all_text = ""
# Alternative: collect each page's text in a list instead
# extracted_text = []
with pdfplumber.open(path) as pdf:
    for page in pdf.pages:
        # extract_text() can return None in some pdfplumber versions when a page has no text
        all_text += (page.extract_text() or "") + "\n\n"
        # extracted_text.append(page.extract_text())

print(all_text)

Work and Projects
A a d i t y a
Fresher with a strong theoretical understanding of company systems and operations.
P a n d a g l e While not having direct work experience, you can explore my GitHub profile where I’ve
performed statistical analysis and created predictive models. The portfolio spans various
topics such as:
Data Science | AI & Machine Learning |
MLOps Machine Learning:
Fraud Detection: Identify fraudulent activities within financial transactions using ML
algorithms on authentic datasets.
Credit Risk Assessment: Assessing credit risk for loan applicants using predictive
Contact:
modelling.
House Rent Prediction: Using machine learning algorithms to predict house rent
adi.pandagle@gmail.com
prices in India.
+91 79725 67926 Customer Discount Offering Prediction: Predict which customers will likely receive
discounts for targeted marketing.
LinkedIn and GitHub link Salary Prediction: Predicting individual salaries based on demographic and
employment details.
Education: Deep Le
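
The extracted text above contains letter-spaced words such as "A a d i t y a" and "P a n d a g l e", which come from the styled heading in the PDF. A minimal sketch of one way to tidy these before passing the text to the model is shown below. The helper name `collapse_letter_spacing` is hypothetical and not part of pdfplumber, and the rule it applies (three or more single letters separated by single spaces) is an assumption that could also join genuine one-letter words in other documents.

```python
import re

# Matches runs of at least three single letters separated by single spaces,
# for example "A a d i t y a". This threshold is an illustrative assumption.
_SPACED_WORD = re.compile(r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b")


def collapse_letter_spacing(text: str) -> str:
    """Join letter-spaced words into normal words (hypothetical helper)."""
    return _SPACED_WORD.sub(lambda m: m.group(0).replace(" ", ""), text)


sample = "A a d i t y a\nP a n d a g l e While not having direct work experience"
print(collapse_letter_spacing(sample))
```

This only repairs the letter spacing. It does not undo the interleaving of the two resume columns, which would need layout-aware extraction instead.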

In [15]:
get_llama_assistance(f'''I need your help to summarize the Resume I extracted from a PDF:
                     {all_text}''')

`
Here is a summarized version of the resume:

**Contact Information:**

* Email: [adi.pandagle@gmail.com](mailto:adi.pandagle@gmail.com)
* Phone: +91 79725 67926
* LinkedIn and GitHub links (not provided)

**Summary:**
Fresh graduate with a strong theoretical understanding of company systems and operations, with a focus on data science, AI, and machine learning. Portfolio includes projects on fraud detection, credit risk assessment, house rent prediction, and more.

**Education:**

* Post-Graduation Program in Data Science and Analytics, Imarticus Learning (2023-2024)
* Bachelor's in Business Administration (Computer Application), Modern College of Commerce & Computer Studies, Savitribai Phule Pune University (2020-2023)
* Senior Secondary School, PDEA's Shri Mhalsakant Vidyalaya & Jr. College (2018-2020)
* High School, S.P.M English Medium School (2007-2018)

**Certifications:**

* Udemy: Python for Data Science and Machine Learning Bootcamp
* Kaggle: Beginner and Intermediate Machine Learning
* HackerRank: SQL Basic and Intermediate
* Udemy: Python Data Structures and Algorithm
* Udemy: Amazon Web Services - Solution Architect

**Skills:**

* Machine Learning: Supervised, Unsupervised, and Deep Learning
* Data Science: Data Analysis, Visualization, Statistical Analysis, and Modeling
* Cloud: Amazon Web Services (EC2, S3, SageMaker, and others)
* Web Development: Flask, FastAPI, Django
* Programming Languages: Python
* Databases: MySQL
* MLOps: MLflow, Packaging, Version Control (Git), Docker, Jenkins, and CI/CD Pipeline
* Auxiliary Skills: Data Structures and Algorithms, Cloud, Problem-solving, and Critical Thinking

**Hobbies:**

* Avid reader
* Artist (sketching figures and objects)
* Strategic chess player
* Video/board games
* Coding
* Listening and creating Lo-fi tracks for fun on YouTube (novice)

Let me know if you'd like me to reorganize or clarify anything!
`
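
The summarization call above embeds `all_text` in an indented triple-quoted f-string, so the prompt carries the extra indentation and blank lines from the extraction. The sketch below shows one possible way to build a tidier prompt. The function name `build_summary_prompt`, the `<resume>` delimiters, and the `max_chars` cap are all assumptions made for illustration; the cap is not a documented Groq limit.

```python
def build_summary_prompt(resume_text: str, max_chars: int = 12000) -> str:
    """Wrap extracted resume text in a delimited prompt (hypothetical helper).

    max_chars is an assumed, illustrative cap that keeps very long
    extractions from crowding out the model's reply.
    """
    # Drop empty lines and surrounding whitespace left over from extraction
    cleaned = "\n".join(
        line.strip() for line in resume_text.splitlines() if line.strip()
    )
    if len(cleaned) > max_chars:
        cleaned = cleaned[:max_chars]
    return (
        "Summarize the resume between the <resume> tags. "
        "Use only information that appears in the text.\n"
        f"<resume>\n{cleaned}\n</resume>"
    )


print(build_summary_prompt("  Skills:  \n\n  Python \n"))
```

The returned string could then be passed to `get_llama_assistance` in place of the inline f-string.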