In [1]:
import os
import datetime
import json
import jieba
from collections import namedtuple
from langchain_google_genai import ChatGoogleGenerativeAI, HarmBlockThreshold, HarmCategory
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from IPython.display import display
from IPython.display import Markdown

In [2]:
# Generation settings: temperature 0.0 for repeatable output, up to 4096 output tokens
generation_config = {
    "temperature": 0.0,
    "top_p": 1,
    "top_k": 32,
    "max_output_tokens": 4096,
}

# Safety settings: no blocking for any of the four harm categories listed
safety_settings = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}
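
The `top_k` and `top_p` values in the generation settings above limit which candidate tokens a model may sample from. The sketch below is a generic illustration of top-k followed by nucleus (top-p) filtering over a toy distribution. It is not Gemini's actual implementation; `filter_candidates` and the `toy` probabilities are hypothetical names and values invented for the example.

```python
def filter_candidates(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the shortest prefix whose
    cumulative probability reaches top_p (generic sketch, not Gemini's code)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append(token)
        mass += p
        if mass >= top_p:
            break
    return kept


# Hypothetical next-token distribution
toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}

print(filter_candidates(toy, top_k=32, top_p=1))   # settings used in this notebook
print(filter_candidates(toy, top_k=2, top_p=0.9))  # a tighter, hypothetical setting
```

With `top_p` set to 1 and `top_k` larger than the toy vocabulary, nothing is filtered out. In this notebook, `temperature` at 0.0 is therefore likely the setting that does most of the work.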

In [3]:

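model = ChatGoogleGenerativeAI(
    model="gemini-pro",
    # Unpack the settings: temperature, top_p, top_k and max_output_tokens
    # are constructor fields of ChatGoogleGenerativeAI
    **generation_config,
    safety_settings=safety_settings,
    # Read the key from the environment instead of hard-coding it in the notebook
    google_api_key=os.environ["GOOGLE_API_KEY"],
)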

# Define the role, objective, focus, restrictions, provided data, and starting work
# Role Name: Criminal profiler.
# Role Objective: Create a psychological profile based on browsing history.
# Role Focus: Motivations, psychological characteristics, behavioral patterns, relevant insights.
# Role Restrictions: Avoid identification or accusations, no legal advice.
# Provided Data: List of web pages visited with titles and timestamps.
# Starting Work: Asking the role to perform the task with the provided data.


# Create the prompt template
template = """
{role}\
{provided_data}\
{start}
"""
prompt = ChatPromptTemplate.from_template(template)
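
The trailing backslashes in the template matter: inside a non-raw triple-quoted string, a backslash at the end of a line removes the newline that follows it, so the three placeholders sit directly next to each other. The sketch below shows this with plain `str.format` and made-up placeholder values. It is only an illustration of the string layout, not of how `ChatPromptTemplate` builds its messages.

```python
# Same template text as in the cell above
template = """
{role}\
{provided_data}\
{start}
"""

# The backslash-newline pairs are gone from the resulting string
print(repr(template))

# Hypothetical stand-in values, not the contents of the downloaded files
rendered = template.format(
    role="ROLE\n",
    provided_data="DATA\n",
    start="START",
)
print(repr(rendered))
```

Any line breaks between the role, the data, and the closing instruction therefore have to come from the input files themselves, for example as a trailing newline at the end of each file.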

In [4]:
! wget -q https://raw.githubusercontent.com/frankwxu/digital-forensics-lab/main/AI4Forensics/CKIM2024/Takeout/role.txt
! wget -q https://raw.githubusercontent.com/frankwxu/digital-forensics-lab/main/AI4Forensics/CKIM2024/Takeout/titles_with_timestamp.txt
! wget -q https://raw.githubusercontent.com/frankwxu/digital-forensics-lab/main/AI4Forensics/CKIM2024/Takeout/start.txt
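
Because `wget -q` suppresses its output, a failed download goes unnoticed until a later cell tries to open the file. A minimal sketch of a check that could run right after the downloads is shown below. The helper name `missing_or_empty` is hypothetical, and the file names are taken from the URLs above.

```python
from pathlib import Path

# File names taken from the wget commands above
EXPECTED = ["role.txt", "titles_with_timestamp.txt", "start.txt"]


def missing_or_empty(names, directory="."):
    """Return the names that are absent or zero bytes long in `directory`."""
    base = Path(directory)
    return [
        name
        for name in names
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]


problems = missing_or_empty(EXPECTED)
if problems:
    print("Download failed or produced empty files:", problems)
```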

In [5]:
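output_parser = StrOutputParser()
# Pipeline: fill the template, call the model, return the reply as a plain string
chain = prompt | model | output_parser

# Load the three prompt parts downloaded in the previous cell
with open("role.txt", "r") as file:
    role = file.read()

with open("titles_with_timestamp.txt", "r") as file:
    provided_data = file.read()

with open("start.txt", "r") as file:
    start = file.read()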

result = chain.invoke(
    {
        "role": role,
        "provided_data": provided_data,
        "start": start,
    }
)
Markdown(result)

**Psychological Profile of the Suspect**

**Possible Motivations:**

* **Academic or Career Advancement:** The suspect's browsing history suggests an interest in career development in the tech industry, particularly in building a successful career. This could indicate a desire for recognition, status, or financial gain.
* **Knowledge Acquisition:** The suspect has accessed resources on data science, browser history analysis, and human subject research. This may indicate a need to acquire specific knowledge or skills for academic or professional purposes.
* **Data Collection:** The repetitive searches for "internet browsers history" and "pull user browser history" suggest an interest in collecting or analyzing browsing data. This could be motivated by research, curiosity, or potential misuse.

**Psychological Characteristics:**

* **Curiosity and Intellectual Engagement:** The suspect's browsing history demonstrates an inquisitive nature and a desire to learn about different topics.
* **Methodical and Analytical:** The suspect has conducted systematic searches and explored multiple resources, indicating an organized and analytical approach to problem-solving.
* **Secrecy and Privacy Concerns:** The suspect has shown an interest in human subject research and ethical considerations related to data collection. This may reflect a concern for privacy or a desire to operate discreetly.

**Behavioral Patterns:**

* **Extensive Web Browsing:** The suspect has spent significant time browsing the internet, particularly focusing on topics related to technology, data science, and web browser history.
* **Targeted Searches:** The suspect has conducted specific searches for information on data collection techniques and browser history analysis.
* **Repeated Access:** The suspect has repeatedly accessed the same websites and resources, suggesting a consistent interest in these topics.

**Other Relevant Insights:**

* **Possible Academic Affiliation:** The suspect's access to the University of Baltimore's MyUB portal and Canvas platform suggests a possible affiliation with the university.
* **Interest in Technology and Data:** The browsing history reveals a strong interest in technology, data science, and data collection.
* **Potential Ethical Concerns:** The suspect's interest in human subject research and ethical considerations related to data collection may indicate an awareness of potential ethical implications of their actions.

**Caution:** It is important to note that this profile is based solely on the provided web history and does not account for other factors that may influence the suspect's behavior. A comprehensive psychological assessment would be necessary to fully understand the suspect's motivations and intentions.