# Use Case Description

Mining engineers' public job postings for open-source intelligence (OSINT) is a common practice. <br>
In this use case, we analyze a hypothetical DevOps engineer job description to uncover potential exploitation scenarios.

## Model used for this use case
The Reasoning Model is well suited to this use case because the task is complex and requires deep reasoning. In this example, we use the Reasoning Model via a SageMaker endpoint.

**Note**: Update the configuration variables below to match your deployment.

## Configuration
Update these variables to match your SageMaker deployment:

In [None]:
# Update these variables to match your deployment
endpoint_name = 'foundation-sec-8b-endpoint'
aws_region = 'us-east-1'

print("Configuration:")
print(f"Endpoint: {endpoint_name}")
print(f"Region: {aws_region}")

## Setup

This notebook calls a SageMaker endpoint instead of loading the model locally.

### Notice
- This notebook uses SageMaker endpoints for scalable inference instead of local model hosting.
- The reasoning model is accessed via API calls, providing better resource management and scalability.
- Outputs may still vary slightly between runs, even with identical parameters, because inference runs on a remote managed service.
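
Because the model sits behind an endpoint, it can help to confirm that the endpoint is actually in service before sending prompts. The following is a minimal sketch, not part of the original notebook: the helper names `endpoint_status` and `endpoint_is_ready` are hypothetical, and the client passed in is assumed to be a boto3 `sagemaker` (control-plane) client rather than the `sagemaker-runtime` client used for inference.

```python
def endpoint_status(sm_client, name):
    """Return the EndpointStatus string that SageMaker reports for `name`."""
    # describe_endpoint belongs to the boto3 "sagemaker" control-plane client,
    # not to the "sagemaker-runtime" client used for invoke_endpoint.
    return sm_client.describe_endpoint(EndpointName=name)["EndpointStatus"]


def endpoint_is_ready(sm_client, name):
    """True only when the endpoint reports the "InService" status."""
    return endpoint_status(sm_client, name) == "InService"


# Hypothetical usage inside the notebook (requires AWS credentials):
#   import boto3
#   sm = boto3.client("sagemaker", region_name=aws_region)
#   print(endpoint_is_ready(sm, endpoint_name))
```

Passing the client in as an argument keeps the check easy to exercise with a stub object when no AWS credentials are available.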

In [None]:
import boto3
import json
import re
from IPython.display import display, Markdown

# Initialize SageMaker runtime client
sagemaker_runtime = boto3.client('sagemaker-runtime', region_name=aws_region)

print(f"Connected to SageMaker endpoint: {endpoint_name}")

In [None]:
# Generation arguments for detailed reasoning
generation_args = {
    "max_new_tokens": 2048,
    "temperature": None,  # Left unset; decoding is greedy because do_sample is False
    "repetition_penalty": 1.2,
    "do_sample": False,   # Greedy decoding for consistent reasoning
    "use_cache": True
}

print("Generation configuration:")
for key, value in generation_args.items():
    print(f"  {key}: {value}")

In [None]:
def inference(prompt):
    """Inference function using SageMaker endpoint for reasoning tasks"""
    
    # Format the conversation for reasoning model
    formatted_prompt = f"User: {prompt}\n\nAssistant: <think>\n"
    
    # Prepare payload for SageMaker endpoint
    payload = {
        "inputs": formatted_prompt,
        "parameters": generation_args
    }
    
    try:
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType='application/json',
            Body=json.dumps(payload)
        )
        
        result = json.loads(response['Body'].read().decode())
        
        # Handle different TGI response formats
        if isinstance(result, list) and len(result) > 0:
            generated_text = result[0].get('generated_text', '')
        elif isinstance(result, dict):
            generated_text = result.get('generated_text', str(result))
        else:
            generated_text = str(result)
        
        # Clean up the response (remove the original prompt if it's included)
        if generated_text.startswith(formatted_prompt):
            response_text = generated_text[len(formatted_prompt):].strip()
        else:
            response_text = generated_text.strip()
        
        # Extract thinking part for reasoning models
        # Look for content between <think> and </think> or end of text
        think_match = re.search(r'<think>(.*?)(?:</think>|$)', response_text, re.DOTALL)
        if think_match:
            thinking_content = think_match.group(1).strip()
            return thinking_content
        
        # Remove any trailing special tokens
        response_text = re.sub(r'<\|.*?\|>$', '', response_text).strip()
        
        return response_text
        
    except Exception as e:
        print(f"Error invoking endpoint: {str(e)}")
        return f"Error: {str(e)}"

# Test the inference function
test_response = inference("Hello, can you help with OSINT analysis?")
print("Test Response:")
print(test_response[:200] + "..." if len(test_response) > 200 else test_response)
print("\nSageMaker inference function ready!")

## Generate a mock job description as OSINT to be analyzed

Let's first create a mock job description by generating it with our reasoning model. <br>
If you have a specific job description in mind, or want to use a real one found online, assign it to the `job_description` variable instead.
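
If the posting is saved locally, one way to bring it in is to read it from a file. This is only a sketch under stated assumptions: the helper `load_job_description` and the file name `job_posting.md` are hypothetical and do not appear in the original notebook.

```python
from pathlib import Path


def load_job_description(path, encoding="utf-8"):
    """Read a saved job posting and return its text without outer whitespace."""
    text = Path(path).read_text(encoding=encoding).strip()
    if not text:
        # Fail early rather than sending an empty posting to the model.
        raise ValueError(f"{path} is empty; nothing to analyze")
    return text


# Hypothetical usage: run this instead of the generation cell.
# job_description = load_job_description("job_posting.md")
```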

In [None]:
prompt = '''You are a recruiter at Cisco, hiring a Senior DevOps Engineer for the Security Products team in San Jose. 
Write a comprehensive and detailed job description for this role in Markdown format.'''

print("Generating job description using SageMaker endpoint...")
job_description = inference(prompt)
print("Job description generated successfully!\n")
display(Markdown(job_description))

## Analyze the job description to find potential exploitation scenarios

Let's have the model analyze the job description and generate a report. <br>
It's recommended to provide the model with a clear format to follow for consistency and clarity.
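
One way to keep the format explicit is to build the prompt from a list of section headings and then check that the model's report actually contains them. The sketch below is an illustration rather than the notebook's own method: `REPORT_SECTIONS`, `build_report_prompt`, and `missing_sections` are hypothetical names, and the wording of the formatting instruction is an assumption.

```python
import re

# Headings the report is expected to contain, in order.
REPORT_SECTIONS = ["Summary", "Technical Findings", "Potential Exploitations"]


def build_report_prompt(role, task, job_description, sections=REPORT_SECTIONS):
    """Assemble a prompt that asks for a Markdown report with fixed headings."""
    headings = "\n".join(f"## {name}" for name in sections)
    return (
        f"{role}\n"
        f"{task}\n"
        "Write the report in Markdown using exactly these headings, in this order:\n\n"
        f"{headings}\n\n"
        "<Job Description>\n"
        f"{job_description}\n"
    )


def missing_sections(report, sections=REPORT_SECTIONS):
    """Return the requested headings that do not appear in the model's report."""
    # Collect every Markdown heading (any level), compared case-insensitively.
    found = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}[ \t]*(.+?)[ \t]*#*$", report, re.MULTILINE)
    }
    return [name for name in sections if name.lower() not in found]
```

A non-empty result from `missing_sections` suggests the model drifted from the requested format, which could be a cue to re-run the request or tighten the instructions.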

In [None]:
analysis_prompt = f'''You are a cybersecurity expert outside Cisco who wants to run some tests against their products using only public information.
Based on your knowledge and experience of OSINT, analyze the job description below and write a report of potential exploitation scenarios for penetration tests in the following format.

## Summary
## Technical Findings
## Potential Exploitations


<Job Description>
{job_description}
'''

print("Analyzing job description for OSINT intelligence...")

In [None]:
response = inference(analysis_prompt)
print("OSINT analysis complete!\n")
display(Markdown(response))

## Advanced OSINT Analysis

Let's perform additional analysis focusing on specific attack vectors:

In [None]:
# Supply Chain Attack Analysis
supply_chain_prompt = f'''Based on the job description below, analyze potential supply chain attack vectors that could be identified through OSINT. 
Focus on:
1. Third-party dependencies and tools mentioned
2. Development workflow vulnerabilities
3. Infrastructure dependencies
4. Vendor relationships that could be exploited

Provide specific recommendations for red team exercises.

<Job Description>
{job_description}
'''

print("=== SUPPLY CHAIN ATTACK ANALYSIS ===")
supply_chain_analysis = inference(supply_chain_prompt)
display(Markdown(supply_chain_analysis))

In [None]:
# Add your own job description here for analysis
custom_job_description = """
# Paste your job description here
# Example:
# Senior Cloud Security Engineer
# We are seeking a Senior Cloud Security Engineer to join our team...
"""

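# Skip the analysis while the placeholder text is still in place.
# Strip first: the string starts with a newline, so a bare startswith() never matches.
if custom_job_description.strip() and not custom_job_description.strip().startswith("# Paste your"):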
    print("=== CUSTOM JOB DESCRIPTION ANALYSIS ===")
    
    custom_prompt = f'''Analyze this job description from an OSINT and penetration testing perspective.
    
Focus on:
- Cloud security vulnerabilities
- Container orchestration risks
- API security weaknesses
- Identity and access management gaps
- Compliance framework exposures

Provide actionable intelligence that could be used in authorized security assessments.

<Job Description>
{custom_job_description}
'''
    
    custom_analysis = inference(custom_prompt)
    display(Markdown(custom_analysis))
else:
    print("💡 Add your own job description in the custom_job_description variable above to analyze it!")