In [1]:
!pip install google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client -q
!pip install python-dotenv -q

## Connecting to Google Docs

In [7]:
import os
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from dotenv import load_dotenv

# I only want read access
SCOPES = ['https://www.googleapis.com/auth/documents.readonly']

# Paths to the token and credentials files
# The client secret JSON is downloaded from my Google Cloud Console
CREDS_PATH = 'auth/client_secret.json'
# The token JSON is created automatically the first time you authorize the app
TOKEN_PATH = 'auth/token.json'

def get_credentials():
    creds = None
    if os.path.exists(TOKEN_PATH):
        creds = Credentials.from_authorized_user_file(TOKEN_PATH, SCOPES)
    else:
        flow = InstalledAppFlow.from_client_secrets_file(CREDS_PATH, SCOPES)
        creds = flow.run_local_server(port=0)
        with open(TOKEN_PATH, 'w') as token:
            token.write(creds.to_json())
    return creds

def read_google_doc(doc_id):
    creds = get_credentials()  # Make sure I am authenticated
    service = build('docs', 'v1', credentials=creds)  # Connect to the Docs API
    doc = service.documents().get(documentId=doc_id).execute()  # Fetch the document as a dict

    # Concatenate the text runs of every paragraph in the document body
    content = ""
    for element in doc.get("body", {}).get("content", []):
        if "paragraph" in element:
            for text_run in element["paragraph"].get("elements", []):
                if "textRun" in text_run:
                    content += text_run["textRun"]["content"]
    return content

#Loads the environment variables, where I have saved the Doc Id
load_dotenv()
doc_id = os.getenv("GOOGLE_DOC_ID")
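
`read_google_doc` above collects text from top-level paragraphs only, so text inside tables would be skipped. The sketch below shows one possible way to walk nested elements as well. It works on plain dictionaries shaped like a `documents.get` response; the function name `extract_text` and the sample data are made up for illustration, and this is not what the notebook currently runs.

```python
def extract_text(elements):
    """Recursively collect text from a list of Docs API structural elements."""
    parts = []
    for element in elements:
        if "paragraph" in element:
            for item in element["paragraph"].get("elements", []):
                text_run = item.get("textRun")
                if text_run:
                    parts.append(text_run.get("content", ""))
        elif "table" in element:
            # Table text lives in nested cells, each with its own content list
            for row in element["table"].get("tableRows", []):
                for cell in row.get("tableCells", []):
                    parts.append(extract_text(cell.get("content", [])))
        elif "tableOfContents" in element:
            parts.append(extract_text(element["tableOfContents"].get("content", [])))
        # Other element types (for example section breaks) carry no text
    return "".join(parts)


# Hand-written sample that mimics the shape of a document body (not real data)
sample_body = [
    {"paragraph": {"elements": [{"textRun": {"content": "Heading\n"}}]}},
    {"table": {"tableRows": [{"tableCells": [
        {"content": [{"paragraph": {"elements": [{"textRun": {"content": "Cell text\n"}}]}}]},
    ]}]}},
    {"sectionBreak": {}},
]

print(extract_text(sample_body))
```

With a real document, the call would look like `extract_text(doc.get("body", {}).get("content", []))`.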

The app reads the document every time I run it.

The content could be cached in a local file instead, but reading the 8-page document took only about 4 seconds, and reading it live has an added benefit: the content is always up to date, even if I edit the document between uses of the app. An example of saving the text locally is shown further down.
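
A middle ground between always reading live and always reading from a file is a cache with an expiry time. The sketch below is one way to do that using only the standard library; the helper name `load_or_fetch` and the `max_age_seconds` parameter are assumptions for illustration and are not part of this notebook.

```python
import os
import time


def load_or_fetch(cache_path, fetch, max_age_seconds=3600):
    """Return cached text if the file is fresh enough, otherwise call fetch() and cache the result."""
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path, "r", encoding="utf-8") as f:
                return f.read()
    # Cache is missing or too old: fetch fresh text and overwrite the file
    text = fetch()
    with open(cache_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


# Possible usage in this notebook (needs the authenticated helper defined earlier):
# experience_text = load_or_fetch('cached_experience.txt', lambda: read_google_doc(doc_id))
```

Passing the fetch step as a function keeps the helper independent of the Google API, so the same helper could wrap any slow text source.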

In [8]:
# doc_id was loaded from the GOOGLE_DOC_ID environment variable in the previous cell
experience_text = read_google_doc(doc_id)

# Print the first 1000 characters to show that this is working
print(experience_text[:1000])


Linkedin: https://www.linkedin.com/in/ola-graczyk/
 
1.      Work experience
1.1   UX/UI Designer
Time: October 2024  - November 2024 ( 2 months )
Company: DynamicoAI (Now Firemind)
About the company: DynamicoAI owned a product called Enhanced IQ. It is an enterprise-grade platform that makes it simple to scale generative AI across the organisation. The platform can be integrated with the most popular databases and systems and sensitive data is protected. It can be used by employees on all levels for example to search company sharepoint,  retrieve data, write structured prompts to make working faster.
About my role:
As part of my role at DynamicoAI, I was responsible for evaluating the Enhanced IQ platform with a fresh perspective, identifying potential issues, and suggesting improvements. My key responsibilities included:
-        Conducted a thorough review of the platform, documenting all observations, usability concerns, and bugs.
-        Compiled findings into a structured report

In [5]:
# Save locally
with open('cached_experience.txt', 'w', encoding='utf-8') as f:
    f.write(experience_text)

# Later, you can read from the file without calling the Google API
# with open('cached_experience.txt', 'r', encoding='utf-8') as f:
#     experience_text = f.read()

## Scrape the web


In [3]:
!pip install requests beautifulsoup4 -q
from web_scraper import scrape
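
`web_scraper` is a local module whose code is not shown in this notebook. As a rough idea of the kind of step a helper like `scrape` might perform after downloading a page, the sketch below pulls the visible text out of an HTML string. It is a hypothetical stand-in that uses the standard library's `html.parser` rather than `requests` and `beautifulsoup4`; the names `VisibleTextParser` and `html_to_text` and the sample HTML are made up.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.lines.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


# Made-up HTML for illustration only
sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Example job title</h1><script>var x = 1;</script>"
    "<p>Example city</p></body></html>"
)
print(html_to_text(sample_html))
```

Pages that render their content with JavaScript would need a different approach, since the raw HTML may not contain the job text at all.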

In [7]:
url = input("Paste the link to job description: ")

# scrape() is expected to save the page text to job_description.txt, which is read back below
scrape(url)

with open('job_description.txt', 'r', encoding='utf-8') as f:
    job_description = f.read()

print(job_description[300:600])

Paste the link to job description:  https://justjoin.it/job-offer/allegro-junior-machine-learning-engineer---e-xperience-associate-warszawa-ai


w
AI/ML
Junior Machine Learning Engineer - e-Xperience Associate
Allegro
Warszawa
Type of work
Full-time
Experience
Junior
Employment Type
Permanent
Operating mode
Hybrid
Allegro
At Allegro, we build and maintain some of the most distributed and scalable applications in Central Europe. Work with us 
