
#### 1. Initial Summarization Using Ollama
 
The journey began with a foundational step: analyzing the content of Kamatech Solutions' website.
Using [Ollama](https://ollama.com/), a free tool that runs models locally on your computer and requires no paid API, two language models were employed to extract concise, informative summaries of the website's content.
The focus was on capturing the essence of Kamatech Solutions' products and services, their value proposition, their target audience and industry focus, and any notable aspects of their approach and methodology.
        
#### Stepping Into LLM Engineering

This initial analysis served as a critical starting point for a broader LLM Engineering journey.
Future steps will include building advanced models, customizing them to solve domain-specific problems, refining natural language understanding capabilities, and expanding multilingual features even further.

In [1]:
# Import Libraries

import requests
import json

In [2]:
# Fetch the website content
response = requests.get('https://www.kamatechsolutions.com')
website_content = response.text

# Create a JSON payload for Ollama API
payload = {
    "model": "mistral",
    "prompt": f"Summarize the following website content in 3-8 sentences: {website_content}"
}

try:
    # Call Ollama API (assuming it's running locally on default port)
    ollama_response = requests.post('http://localhost:11434/api/generate', json=payload)
    
    # Check if response is valid
    if ollama_response.status_code == 200:
        # Parse the streaming response
        response_text = ""
        for line in ollama_response.text.strip().split('\n'):
            if line:
                try:
                    response_json = json.loads(line)
                    if 'response' in response_json:
                        response_text += response_json['response']
                except json.JSONDecodeError:
                    continue
        
        summary = response_text if response_text else "No summary generated"
    else:
        summary = f"Error: Received status code {ollama_response.status_code}"
except Exception as e:
    summary = f"Error: {str(e)}"

print("Website Summary:")
print(summary)

Website Summary:
 The provided HTML code appears to be a webpage that displays a blog post list with an option to load more posts when the user clicks a "Load More" button. The JavaScript code handles the AJAX request for loading more posts and adjusts the UI accordingly. There's also some CSS to style the page, such as adding a loading state when new content is being fetched. Overall, it looks like an optimized blog layout using AJAX to improve user experience by only loading the initial set of posts and allowing users to fetch more as needed.


### Interpretation: 

The summary above describes the website's technical implementation rather than its actual content. This often happens when the model is given the raw HTML, CSS, and JavaScript instead of the text that human visitors would see.
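
To make the gap concrete, here is a minimal sketch that compares a raw HTML string with the text a visitor would actually read. It uses only Python's standard-library `html.parser`; the sample page and the `VisibleTextExtractor` class are made up for illustration and are not taken from the Kamatech site.

```python
from html.parser import HTMLParser

# Hypothetical sample page: mostly markup and script, very little visible text.
SAMPLE_HTML = """
<html>
  <head>
    <style>.loading { opacity: 0.5; }</style>
    <script>function loadMore() { fetch('/posts?page=2'); }</script>
  </head>
  <body>
    <h1>Example Co</h1>
    <p>We build analytics dashboards.</p>
    <button onclick="loadMore()">Load More</button>
  </body>
</html>
"""


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of non-visible tags."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        if stripped and self._skip_depth == 0:
            self.chunks.append(stripped)


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML document as one string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


text = visible_text(SAMPLE_HTML)
print(text)
print(f"raw: {len(SAMPLE_HTML)} chars, visible: {len(text)} chars")
```

Even on this tiny page, most of the characters sent to the model would be styling and script rather than content, which is consistent with the code-focused summary above.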

**Here is the approach to improve the summary quality**

In [3]:
# Import Libraries

import requests
import logging
from typing import Optional

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Constants
OLLAMA_API_URL = 'http://localhost:11434/api/generate'  # Default Ollama API endpoint
DEFAULT_MODEL = 'mistral'  # Use mistral as the default model
DEFAULT_MAX_TOKENS = 200  # Adjust as needed
DEFAULT_TEMPERATURE = 0.5  # Adjust as needed
REQUEST_TIMEOUT = 200  # seconds (Ollama can take longer to respond)

def call_ollama_api(prompt: str, model: str = DEFAULT_MODEL, max_tokens: int = DEFAULT_MAX_TOKENS, temperature: float = DEFAULT_TEMPERATURE) -> Optional[str]:
    try:
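        data = {
            'model': model,
            'prompt': prompt,
            'stream': False,  # Return a single JSON object instead of a stream
            # Generation settings go under 'options' in Ollama's native API
            'options': {
                'num_predict': max_tokens,  # Maximum number of tokens to generate
                'temperature': temperature
            }
        }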

        # Send the request to the Ollama API
        response = requests.post(OLLAMA_API_URL, json=data, timeout=REQUEST_TIMEOUT)
        response.raise_for_status()

        # Extract the response content
        result = response.json()
        return result.get('response', '').strip()

    except requests.exceptions.RequestException as e:
        logger.error(f"Request error calling Ollama API: {e}")
    except KeyError as e:
        logger.error(f"Key error in parsing Ollama API response: {e}")
    except Exception as e:
        logger.error(f"Unexpected error calling Ollama API: {e}")

    return None

# Example usage
if __name__ == "__main__":
    prompt =  f"""Extract the following information from https://kamatechsolutions.com:
        1. Company name and business type
        2. Main products / services
        3. Value proposition and USP
        4. Target audience / industry
        5. Notable methodologies/approaches

        Present as concise bullet points. Only include explicitly stated facts.
        """
    summary = call_ollama_api(prompt)
    if summary:
        logger.info(f"Summary: {summary}")
    else:
        logger.error("Failed to generate summary.")

INFO:__main__:Summary: 1. Company Name and Business Type:
     - Name: KamaTech Solutions
     - Business Type: IT Consulting Firm

  2. Main Products / Services:
     - Custom Software Development
     - Mobile Application Development
     - Web Application Development
     - DevOps Services
     - Quality Assurance and Testing
     - UI/UX Design

  3. Value Proposition and USP:
     - Delivering high-quality IT solutions that meet client needs and expectations.
     - Combining deep technical expertise with strong business acumen to provide tailored solutions.
     - A commitment to innovation, quality, and customer satisfaction.

  4. Target Audience / Industry:
     - Businesses across various industries in need of custom IT solutions.
     - Specific industries mentioned: Healthcare, Finance, Retail, Education, and Real Estate.

  5. Notable Methodologies/Approaches:
     - Agile Development Methodology
     - DevOps Practices
     - Lean UX Design Principles
     - Scrum Framewo

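#### Leveraging BeautifulSoup for web scraping:
- The prompt above passes only the URL, and the model does not fetch the page itself, so that summary is not grounded in the site's actual text.
- Use BeautifulSoup to extract the visible content from the HTML and include it in the prompt.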

In [4]:
import requests
from bs4 import BeautifulSoup
import time
import json

def summarize_website(url, model="mistral"):
    """Fetch website content and generate a summary using Ollama API"""
    print(f"Processing {url}...")
    start_time = time.time()
    
    # Fetch and parse website content
    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        
        # Remove non-content elements
        for element in soup(['script', 'style', 'meta', 'link', 'noscript', 'iframe']):
            element.decompose()
        
        # Extract visible text (limit to 8000 chars if needed)
        text = soup.get_text(separator=' ', strip=True)
        text = text[:8000] + "..." if len(text) > 8000 else text
        
        print(f"- Content extracted ({len(text)} chars) in {time.time() - start_time:.2f}s")
        
        # Generate summary using Ollama
        prompt = f"""
        Based on this website content, summarize:
        1. Company name and business type
        2. Main products or services offered
        3. Value proposition
        4. Target audience
        5. Notable methodologies

        Present as concise bullet points. Only include explicitly stated facts.
        
        Website content: {text}
        """
        
        ollama_start = time.time()
        
        # Use stream=True to handle streaming response properly
        response = requests.post(
            'http://localhost:11434/api/generate', 
            json={"model": model, "prompt": prompt},
            stream=True
        )
        
        # Extract the summary from streaming response
        summary = ""
        if response.status_code == 200:
            for line in response.iter_lines():
                if line:
                    try:
                        data = json.loads(line)
                        if 'response' in data:
                            summary += data['response']
                    except json.JSONDecodeError:
                        continue
        else:
            summary = f"Error: API returned status code {response.status_code}"
        
        print(f"- Summary generated in {time.time() - ollama_start:.2f}s")
        
        return summary
    
    except Exception as e:
        return f"Error: {str(e)}"

if __name__ == "__main__":
    total_start = time.time()
    summary = summarize_website('https://www.kamatechsolutions.com')
    print(f"\nTotal time: {time.time() - total_start:.2f}s")
    print("\nSUMMARY:")
    print(summary)

Processing https://www.kamatechsolutions.com...
- Content extracted (8003 chars) in 1.41s
- Summary generated in 263.04s

Total time: 264.45s

SUMMARY:
1. Company name: Kamatech Solutions
    2. Main products or services offered: Advanced Analytics consulting services, including Data Science, Machine Learning, and Artificial Intelligence.
    3. Value proposition: Unload your analytics burden to KAMA-TECH SOLUTIONS, where the project is their priority. They provide workable data-driven IT business solutions, using the latest technology equipment and offering big data analytics, business intelligence (BI), data analytics, I business analytics, database and system design, etc.
    4. Service regions: Worldwide (as specified in the country list provided)
    5. Contact details: Email: info@kamatechsolutions.com; Phone: (281) 676-3571


---------------------------------------------------------------------------------------------------------------

### Key takeaways:
- The code uses **[mistral](https://ollama.com/search)**, a relatively small, fast model and one of the most popular open-source models available in **Ollama**.
- No API key is needed: **Ollama** runs locally on your machine.
- Customization: you can adjust the `max_tokens` and `temperature` parameters to control the response length and creativity.
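
As a minimal sketch of the customization point, the helper below builds request bodies with different length and creativity settings. It assumes Ollama's native `/api/generate` endpoint reads these settings from an `options` object, with `num_predict` as the token limit; `build_generate_payload` is a hypothetical helper name, not part of Ollama.

```python
from typing import Any, Dict


def build_generate_payload(
    prompt: str,
    model: str = "mistral",
    max_tokens: int = 200,
    temperature: float = 0.5,
) -> Dict[str, Any]:
    """Build a non-streaming request body for Ollama's /api/generate.

    Assumption: generation settings are nested under 'options',
    with 'num_predict' capping the number of generated tokens.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "num_predict": max_tokens,
            "temperature": temperature,
        },
    }


# Shorter, more deterministic output versus longer, more varied output.
short_factual = build_generate_payload("Summarize: ...", max_tokens=100, temperature=0.1)
longer_creative = build_generate_payload("Summarize: ...", max_tokens=400, temperature=0.9)

print(short_factual["options"])
print(longer_creative["options"])
```

Either dictionary can be passed as the `json=` argument of `requests.post` against the local endpoint used earlier.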

### Alternative models in Ollama:
If you want to experiment with other models, you can replace **mistral** with one of the following:
- gemma3: billed in the Ollama library as the current, most capable model that runs on a single GPU
- vicuna: a fine-tuned version of Llama for conversational tasks
- llama3: billed at its release as the most capable openly available LLM to date

To use a different model, make sure it is installed locally. Pull it with `ollama pull <model name>` (e.g. **ollama pull llama3**), then update the `DEFAULT_MODEL` constant or the `model` argument in the code.
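
Before switching models, it can help to check whether the model is already installed. The sketch below assumes the local server's `GET /api/tags` endpoint returns a body shaped like `{"models": [{"name": "mistral:latest"}, ...]}`; the helper names and the sample response are hypothetical.

```python
from typing import Any, Dict, List


def installed_model_names(tags_response: Dict[str, Any]) -> List[str]:
    """Return model names from a /api/tags-style response body."""
    return [m["name"] for m in tags_response.get("models", []) if "name" in m]


def is_installed(model: str, tags_response: Dict[str, Any]) -> bool:
    """True if `model` is installed; a name without a tag means ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in installed_model_names(tags_response)


# Hypothetical response body, shaped like the output of GET /api/tags.
sample_tags = {"models": [{"name": "mistral:latest"}, {"name": "llama3:8b"}]}

for candidate in ("mistral", "llama3:8b", "gemma3"):
    if is_installed(candidate, sample_tags):
        print(f"{candidate}: installed")
    else:
        print(f"{candidate}: missing, run `ollama pull {candidate}`")
```

With a running server, the real listing could be fetched with `requests.get('http://localhost:11434/api/tags').json()` and passed in place of `sample_tags`.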
