## Connecting to Docs

In [22]:
%pip install google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client -q
%pip install python-dotenv -q

Note: you may need to restart the kernel to use updated packages.


In [23]:
import os
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from dotenv import load_dotenv

In [None]:


from google.auth.transport.requests import Request

# Read-only access is enough for this app
SCOPES = ['https://www.googleapis.com/auth/documents.readonly']

# Paths to the client secret and the token
# The client secret JSON is downloaded from the Google Cloud Console
CREDS_PATH = 'auth/client_secret.json'
# The token JSON is created automatically the first time you authorize the app
TOKEN_PATH = 'auth/token.json'

def get_credentials():
    creds = None
    if os.path.exists(TOKEN_PATH):
        creds = Credentials.from_authorized_user_file(TOKEN_PATH, SCOPES)
    # Refresh an expired token, or run the browser flow if there is no usable token
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(CREDS_PATH, SCOPES)
            creds = flow.run_local_server(port=0)
        with open(TOKEN_PATH, 'w') as token:
            token.write(creds.to_json())
    return creds

def read_google_doc(doc_id):
    creds = get_credentials()  # Make sure I am authenticated
    service = build('docs', 'v1', credentials=creds)  # Connect to the Docs API
    doc = service.documents().get(documentId=doc_id).execute()  # Fetch the document

    # Concatenate the text runs of every paragraph in the document body
    content = ""
    for element in doc.get("body", {}).get("content", []):
        if "paragraph" in element:
            for text_run in element["paragraph"].get("elements", []):
                if "textRun" in text_run:
                    content += text_run["textRun"]["content"]
    return content

#Loads the environment variables, where I have saved the Doc Id
load_dotenv()
doc_id = os.getenv("GOOGLE_DOC_ID")
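Note that `read_google_doc` only walks top-level paragraphs, so any text sitting inside a table would be skipped. Below is a minimal sketch of a recursive walker that also descends into tables and a table of contents, assuming the usual Docs API layout (`table` → `tableRows` → `tableCells` → `content`). The function name `read_structural_elements` and the `sample_doc` dictionary are made up for illustration; the sample is hand-written, not real API output.

```python
def read_structural_elements(elements):
    """Recursively collect text from a list of Docs API structural elements."""
    text = ""
    for element in elements:
        if "paragraph" in element:
            for run in element["paragraph"].get("elements", []):
                text += run.get("textRun", {}).get("content", "")
        elif "table" in element:
            # Table text lives in the nested content of each cell.
            for row in element["table"].get("tableRows", []):
                for cell in row.get("tableCells", []):
                    text += read_structural_elements(cell.get("content", []))
        elif "tableOfContents" in element:
            toc = element["tableOfContents"]
            text += read_structural_elements(toc.get("content", []))
    return text


# Hand-written stand-in for a documents().get() response (assumption, not real output).
sample_doc = {
    "body": {
        "content": [
            {"paragraph": {"elements": [{"textRun": {"content": "Work experience\n"}}]}},
            {"table": {"tableRows": [{"tableCells": [
                {"content": [{"paragraph": {"elements": [{"textRun": {"content": "Python\n"}}]}}]},
                {"content": [{"paragraph": {"elements": [{"textRun": {"content": "SQL\n"}}]}}]},
            ]}]}},
        ]
    }
}

sample_text = read_structural_elements(sample_doc.get("body", {}).get("content", []))
print(sample_text)
```

If the source document has no tables, the simpler loop in `read_google_doc` gives the same result.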

The app reads the document every time it runs.

The content could be cached in a file, but reading the 8-page document took only about 4 seconds, and fetching it fresh has the added benefit that the content stays up to date even if I make changes between uses of the app. An example of saving the text to a local cache is shown below.

In [None]:
experience_text = read_google_doc(doc_id)

#print some of the content to show that this is working
print(experience_text[:1000])


Linkedin: https://www.linkedin.com/in/ola-graczyk/
 
1.      Work experience
1.1   UX/UI Designer
Time: October 2024  - November 2024 ( 2 months )
Company: DynamicoAI (Now Firemind)
About the company: DynamicoAI owned a product called Enhanced IQ. It is an enterprise-grade platform that makes it simple to scale generative AI across the organisation. The platform can be integrated with the most popular databases and systems and sensitive data is protected. It can be used by employees on all levels for example to search company sharepoint,  retrieve data, write structured prompts to make working faster.
About my role:
As part of my role at DynamicoAI, I was responsible for evaluating the Enhanced IQ platform with a fresh perspective, identifying potential issues, and suggesting improvements. My key responsibilities included:
-        Conducted a thorough review of the platform, documenting all observations, usability concerns, and bugs.
-        Compiled findings into a structured report

In [5]:
# Save locally
with open('cached_experience.txt', 'w', encoding='utf-8') as f:
    f.write(experience_text)

# Later, you can read from the file without calling the Google API
# with open('cached_experience.txt', 'r', encoding='utf-8') as f:
#     experience_text = f.read()
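The save and load steps above could be combined into one helper that only calls the API when the cached file is missing or too old. This is a sketch, not part of the app: `load_cached_text`, its `fetch` callback and the one-day default for `max_age_seconds` are all assumptions chosen for illustration.

```python
import os
import time


def load_cached_text(path, fetch, max_age_seconds=24 * 60 * 60):
    """Return text from `path` if it is fresh enough, else call `fetch()` and re-cache."""
    if os.path.exists(path):
        age = time.time() - os.path.getmtime(path)
        if age < max_age_seconds:
            with open(path, "r", encoding="utf-8") as f:
                return f.read()
    # Cache is missing or stale: fetch fresh text and overwrite the file.
    text = fetch()
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


# Possible usage in this notebook (needs read_google_doc and doc_id from above):
# experience_text = load_cached_text('cached_experience.txt',
#                                    lambda: read_google_doc(doc_id))
```

A short `max_age_seconds` keeps most of the "always up to date" benefit while avoiding repeated API calls during one working session.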

## Scrape the web

In [1]:
%pip install requests beautifulsoup4 -q
from web_scraper import scrape

Note: you may need to restart the kernel to use updated packages.


In [7]:
url = input("Paste the link to job description: ")

scrape(url)

with open('job_description.txt', 'r', encoding='utf-8') as f:
    job_description = f.read()

print(job_description[300:600])

Paste the link to job description:  https://justjoin.it/job-offer/allegro-junior-machine-learning-engineer---e-xperience-associate-warszawa-ai


w
AI/ML
Junior Machine Learning Engineer - e-Xperience Associate
Allegro
Warszawa
Type of work
Full-time
Experience
Junior
Employment Type
Permanent
Operating mode
Hybrid
Allegro
At Allegro, we build and maintain some of the most distributed and scalable applications in Central Europe. Work with us 
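The `scrape` function comes from my own `web_scraper` module, which is not shown in this notebook. As a rough idea of what such a helper might do, here is a sketch that downloads a page, keeps only the visible text and writes it to `job_description.txt`. It deliberately uses only the standard library (`html.parser`, `urllib`) instead of `requests` and BeautifulSoup so that it runs without extra installs; the names `html_to_text` and `scrape_sketch` are hypothetical, and the real module may work differently.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of script/style tags."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.lines.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one text fragment per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


def scrape_sketch(url, out_path="job_description.txt"):
    """Download `url`, extract its text and save it to `out_path`."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    text = html_to_text(html)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text
```

Pages that build their content with JavaScript may return very little text this way and could need a browser-based approach instead.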


## LLMs + LangChain

### Gemini

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

api_key = os.getenv("GOOGLE_API_KEY")

In [2]:
%pip install --quiet langchain-core==0.1.23
%pip install --quiet langchain==0.1.1
%pip install --quiet langchain-google-genai==0.0.6
%pip install --quiet -U langchain-community==0.0.20

Note: you may need to restart the kernel to use updated packages.


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain 0.3.24 requires langchain-text-splitters<1.0.0,>=0.3.8, which is not installed.
langchain 0.3.24 requires langchain-core<1.0.0,>=0.3.55, but you have langchain-core 0.1.23 which is incompatible.
langchain 0.3.24 requires langsmith<0.4,>=0.1.17, but you have langsmith 0.0.87 which is incompatible.




In [None]:
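# Import from langchain_core; the top-level `langchain` imports are deprecated
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser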
# from langchain.schema.prompt_template import format_document

In [5]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro")



In [18]:
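with open("job_prompt_template.txt", "r", encoding="utf-8") as file:
    job_prompt_template = file.read()

job_prompt = PromptTemplate.from_template(job_prompt_template)

# The scraper output was read as UTF-8 earlier, so read it the same way here
with open("job_description.txt", "r", encoding="utf-8") as file:
    job_description = file.read()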

stuff_chain = (
    {"text": lambda _: job_description}
    | job_prompt         # Prompt for Gemini
    | llm                # Gemini API function
    | StrOutputParser()  # output parser
)

# Invoke the chain
job_description_result = stuff_chain.invoke({})

# Print the result
print(job_description_result)


1. **Title of a job:** Junior Machine Learning Engineer - e-Xperience Associate

2. **Role Summary Sentences:** This junior-level role focuses on deploying and maintaining machine learning models/algorithms within Allegro Ads.  The engineer will design, develop, and optimize ML pipelines, collaborating with data scientists and software engineers throughout the model lifecycle.  Responsibilities also include model management, monitoring, and tuning in a production environment.

3. **Seniority Level and Years of Experience:** Junior level. The posting explicitly mentions "Junior" and is structured as an associate program (e-Xperience 2025) suggesting it's an entry-level position or intended for recent graduates with limited practical experience.  While specific years of experience aren't stated, proficiency in Python, SQL, and a passion for ML applications are key requirements.

4. **Tech stack:** Python, SQL, GCP (Big Query, Composer, Vertex AI, Dataproc, Looker Studio), PySpark, Pandas
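The chain only works if the keys in its first step (here `"text"`) match the `{placeholder}` names in the template file, since `PromptTemplate.from_template` takes its input variables from those placeholders. A quick way to check this before calling the model is sketched below, using only the standard library. The helper names and `example_template` are made up for illustration; the real `job_prompt_template.txt` is not shown in this notebook.

```python
from string import Formatter


def template_variables(template):
    """Return the set of {placeholder} names used in a format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_inputs(template, provided):
    """Names the template expects that the chain's input dict does not supply."""
    return template_variables(template) - set(provided)


# Hypothetical template text (assumption, not the actual prompt file).
example_template = "Summarise this job offer:\n{text}\nList the tech stack."

print(template_variables(example_template))
print(missing_inputs(example_template, ["text"]))             # nothing missing
print(missing_inputs(example_template, ["job_description"]))  # "text" is missing
```

The same check applies to the CV prompt, which needs both `experience` and `job_description`.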

In [None]:
with open("job_prompt_template.txt", "r", encoding="utf-8") as file:
    job_prompt_template = file.read()

job_prompt = PromptTemplate.from_template(job_prompt_template)

with open("job_description.txt", "r", encoding="utf-8") as file:
    job_description = file.read()

offer_chain = (
    {"text": lambda _: job_description}
    | job_prompt         # Prompt for Gemini
    | llm                # Gemini API function
    | StrOutputParser()  # output parser
)

# Invoke the chain
job_description_result = offer_chain.invoke({})

print(job_description_result)

In [None]:
# with open("exp_prompt_template.txt", "r") as file:
#    exp_prompt_template = file.read()

# exp_prompt = PromptTemplate.from_template(exp_prompt_template)

# with open("cached_experience.txt", "r") as file:
#     full_experience = file.read()

# exp_chain = (
#     {"experience": lambda _: full_experience, "job_offer": lambda _: job_description_result}
#     | exp_prompt         # Prompt for Gemini
#     | llm                # Gemini API function
#     | StrOutputParser()  
# )

# short_experience = exp_chain.invoke({})

# print(short_experience)

1.2. Machine Learning Engineer with Python
Time: October 2023  - September 2024
Company: Fourteen33
About the company: Fourteen33 provides AI/ML, cloud and marketplace services. Partners with Google
About my role: Machine Learning Engineer with Python
-        Contributed to the Gemini Cookbook by creating eight comprehensive notebooks and two Gemini-powered Streamlit applications showcasing various Gemini API functionalities. This involved researching, developing, testing, and enhancing each notebook to ensure accuracy and usability.
-        Tested and improved all notebooks and applications submitted to the Gemini Cookbook by the team, ensuring their quality and consistency. This dedication to quality enabled the team to maintain high standards and deliver reliable, valuable resources for Gemini users.
-        Developed an internal application for document search using Vertex AI, enhancing team efficiency and productivity by enabling fast and accurate document retrieval.
- Notebook

In [None]:
with open("cv_prompt_template.txt", "r", encoding="utf-8") as file:
    cv_prompt_template = file.read()

cv_prompt = PromptTemplate.from_template(cv_prompt_template)

# The cache was written as UTF-8, so read it back with the same encoding
with open("cached_experience.txt", "r", encoding="utf-8") as file:
    full_experience = file.read()

cv_chain = (
    {"experience": lambda _: full_experience, "job_description": lambda _: job_description_result}
    | cv_prompt         
    | llm                
    | StrOutputParser()  
)

cv = cv_chain.invoke({})

In [21]:
# Save the CV text as a markdown file
with open("generated_cv.md", "w", encoding="utf-8") as file:
    file.write(cv)

# Ola Graczyk
(123) 456-7890 | ola.graczyk@email.com | linkedin.com/in/ola-graczyk | Warsaw, Poland

## Professional Summary

A highly motivated and curious Junior Machine Learning Engineer with a passion for practical ML applications and a proven ability to develop and implement ML solutions.  Proficient in Python, SQL, and eager to contribute to the e-Xperience Associate program at Allegro Ads.  Experienced in developing Gemini-powered Streamlit applications and conducting research on Named Entity Recognition (NER) using Vertex AI.  Seeking to leverage my analytical skills and strong learning aptitude to contribute to the model lifecycle, from development to deployment and ongoing optimization.

## Experience

**Fourteen33, Machine Learning Engineer with Python** | October 2023 - September 2024

* Contributed to the Gemini Cookbook by developing eight notebooks and two Gemini-powered Streamlit applications, demonstrating proficiency in Python and practical ML application development.
* Enhanced the quality and consistency of the Gemini Cookbook by testing and improving all submitted notebooks and applications.
* Demonstrated a strong understanding of Gemini's capabilities by brainstorming and presenting ten innovative application ideas.
* Led research and development for a client project focusing on Named Entity Recognition (NER), showcasing analytical skills and a proactive approach to problem-solving.
* Developed an internal application for document search using Vertex AI, improving team efficiency and demonstrating experience with GCP services.
* Completed the Machine Learning Engineer Certification, confirming a solid foundation in ML principles.

**DynamicoAI (Now Firemind), UX/UI Designer** | October 2024 - November 2024

* Conducted a comprehensive review of the Enhanced IQ platform, identifying usability issues and bugs, and documenting findings to improve platform performance and user experience.
* Designed UI modifications using Figma, proposing improvements to button layouts and page structures.
* Reviewed and simplified complex prompts for LLMs, leveraging knowledge of best practices in prompt engineering.
* Created comprehensive documentation detailing findings and proposed changes, demonstrating strong communication skills.

## Education

**SGH Warsaw School of Economics, Master of Science, Advanced Analytics - Big Data** | September 2024 - Present

* Relevant coursework: Python Programming, Big Data Programming in Databricks, Cloud Computing in AWS, Data Mining (Python and SAS), Databases in Oracle APEX, Real-time Analysis with Kafka.

**Adam Mickiewicz University in Poznań, Bachelor of Science, Cognitive Science** | September 2021 - July 2024

* Conducted research and an experiment with 40 participants for a bachelor's thesis on the interplay between human cognition and AI.
* Relevant coursework:  Methodology and Statistics, Algorithmics, Python Programming, Data Analysis and Visualization, Artificial Intelligence and Artificial Life (programming), Artificial Intelligence Methods (programming).

## Projects

**Hackathon Winner at WPiK UAM (2023)** | [GitHub Repository](https://github.com/tykfikk/daj-DUHA/blob/main/DUHA8)

* Led a team of four to develop a first-place, Python-based application using the OpenAI API during a 20-hour hackathon.
* Leveraged cognitive science knowledge to design an application that assesses a user's condition and provides appropriate assistance.
* Managed team responsibilities, including task delegation and technical support.

**Markov Chain Text Generation - BLEU Evaluation (Python and Databricks)** | [GitHub Repository](https://github.com/olagraczyk/Markov-Chain-Text-Generation-BLEU-Evaluation)


## Skills

Python, SQL, GCP (Vertex AI), PySpark, Pandas, PyTorch/TensorFlow (currently learning),  Figma,  Streamlit,  Prompt Engineering,  Data Analysis,  Machine Learning,  MLOps (learning), Communication (English C1).

# Remember to check the actual output in the CV markdown file, since models can write incorrect information!
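Part of that check can be automated. The sketch below pulls e-mail addresses, URLs and phone-like numbers out of the generated CV and lists the ones that never appear in the source experience text, since those are details the model may have invented. It is a rough heuristic with hand-written regular expressions (all assumptions), so it can miss things or flag harmless ones, and it does not replace reading the CV. The `cv_sample` and `source_sample` strings are small stand-ins for `cv` and `full_experience`.

```python
import re

# Simple patterns; intentionally loose, not full validators.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"https?://[^\s)\]]+")
PHONE_RE = re.compile(r"\(?\+?\d[\d\s().-]{7,}\d")


def unsupported_details(cv_text, source_text):
    """Return contact-like details found in the CV but absent from the source text."""
    found = set(EMAIL_RE.findall(cv_text))
    found |= set(URL_RE.findall(cv_text))
    found |= set(PHONE_RE.findall(cv_text))
    return sorted(item for item in found if item not in source_text)


# Stand-ins for the real `cv` and `full_experience` variables.
cv_sample = "(123) 456-7890 | ola.graczyk@email.com | https://www.linkedin.com/in/ola-graczyk/"
source_sample = "Linkedin: https://www.linkedin.com/in/ola-graczyk/"

for detail in unsupported_details(cv_sample, source_sample):
    print("Not found in source:", detail)
```

In the notebook this would be called as `unsupported_details(cv, full_experience)` after the CV has been generated.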