In [1]:
import setup

setup.init_django()

In [2]:
from rag import (
    db as rag_db, 
    engines as rag_engines,
    settings as rag_settings, 
    updaters as rag_updaters,
    patches as rag_patches,
)

In [3]:
from typing import Optional, Union
from sqlalchemy import create_engine, text

In [4]:
rag_settings.init()
rag_db.init_vector_db()
rag_updaters.update_llama_index_documents(use_saved_embeddings=True)

In [5]:
vector_index = rag_engines.get_semantic_query_index()
semantic_query_retriever = rag_engines.get_semantic_query_retriever_engine()
sql_query_engine = rag_engines.get_sql_query_engine()

In [6]:
print(rag_settings.VECTOR_DB_NAME, rag_settings.VECTOR_DB_TABLE_NAME)

vector_db blogpost


In [7]:
from llama_index.core.tools import QueryEngineTool

vector_tool = QueryEngineTool.from_defaults(
    query_engine=semantic_query_retriever,
    description=(
        "Useful for answering semantic questions about different blog posts"
    ),
)

In [8]:
sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    description=(
        "Useful for translating a natural language query into a SQL query over"
        " a table containing: blog posts and page views for each blog post"
    ),
)

In [9]:
query_engine = rag_patches.MySQLAutoVectorQueryEngine(
    sql_tool, 
    vector_tool,
)

In [10]:
response = query_engine.query(
    "What do you make?"
)

Querying SQL database: The question 'What do you make?' is more likely to be answered by translating it into a SQL query to retrieve data from a table containing blog posts and page views.
SQL query: SELECT title, content
FROM data_blogpost
ORDER BY timestamp DESC
LIMIT 1;
SQL response: Based on the latest blog post, the author discusses thoughts on the phrase "I’m bored." The content suggests that feeling bored can be beneficial, especially for those under 14, as it can prompt forward motion and encourage self-entertainment. The author views boredom as an opportunity to use empty space creatively.
Transformed query given SQL response: Based on the original question "What do you make?" and the SQL response, the author's latest blog post discusses thoughts on the phrase "I’m bored." To fully answer the original question, we need more specific information about what the author creates or produces. Therefore, a more specific question could be:

New 

In [11]:
response.response

'Based on the information gathered from both the SQL query and the vector store query, here\'s a synthesized response to the original question "What do you make?":\n\nThe author creates a variety of content and is involved in several educational initiatives. Specifically, the author writes blog posts that cover a range of topics, including decision-making, cultural progress, resource management, and the differences between education and learning. For instance, in their latest blog post, the author discusses the phrase "I’m bored," suggesting that boredom can be beneficial, especially for those under 14, as it can prompt forward motion and encourage self-entertainment.\n\nAdditionally, the author is involved in running educational programs. One notable program is the altMBA, which focuses on learning and leadership. This indicates that the author\'s work revolves around creating educational content and programs that foster personal growth and development.'
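
The verbose log above suggests a four-step flow: query SQL first, rewrite the question in light of the SQL answer, query the vector store, then synthesize both answers. The sketch below mirrors that flow in plain Python. It is an illustration only, not the implementation of `rag_patches.MySQLAutoVectorQueryEngine`; `sql_then_vector` and all four callables are hypothetical stand-ins so the cell runs without a database or an LLM.

```python
from typing import Callable


def sql_then_vector(
    question: str,
    run_sql: Callable[[str], str],
    rewrite: Callable[[str, str], str],
    run_vector: Callable[[str], str],
    synthesize: Callable[[str, str, str], str],
) -> str:
    """Toy version of the SQL-first, vector-second flow seen in the log."""
    sql_answer = run_sql(question)             # 1. answer from the SQL tool
    follow_up = rewrite(question, sql_answer)  # 2. ask for what is still missing
    vector_answer = run_vector(follow_up)      # 3. semantic lookup with the new question
    return synthesize(question, sql_answer, vector_answer)  # 4. merge both answers


# Hypothetical stubs: each lambda stands in for an LLM or query-engine call.
answer = sql_then_vector(
    "What do you make?",
    run_sql=lambda q: "latest post: I'm bored",
    rewrite=lambda q, sql: f"{q} (beyond: {sql})",
    run_vector=lambda q: f"vector hit for [{q}]",
    synthesize=lambda q, sql, vec: f"{sql} | {vec}",
)
print(answer)
```

Passing the steps in as callables keeps the order of operations visible and makes each stage easy to swap or log when debugging a surprising final answer.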

In [16]:
response = query_engine.query(
    "What are the top 5 most viewed blog posts? What keywords do their content have?"
)

Querying SQL database: The question asks for the top 5 most viewed blog posts and keywords in their content, which involves translating the query into a SQL query over a table containing blog posts and page views.
SQL query: SELECT
    db.id,
    db.title,
    db.content,
    COUNT(ap.id) AS view_count
FROM
    data_blogpost db
JOIN
    analytics_pageview ap ON db.id = ap.post_id
GROUP BY
    db.id
ORDER BY
    view_count DESC
LIMIT 5;
SQL response: Based on the query results, the top 5 most viewed blog posts are:

1. **"Blog Post 1"** with 3,208 views.
   - Keywords: Harry, Here, before, you

2. **"Blog Post 2"** with 2,201 views.
   - Keywords: You, Here, before, Harry

3. **"Blog Post 3"** with 1,761 views.
   - Keywords: Harry, Was, not, Here

4. **"What kind of org?"** with 1,235 views.
   - Keywords: organization, systems, charts, boxes, organism, changes

5. **"Taking it very seriously"** with 1,125 views.
   - Keywords: Today, April, first, day, g

In [13]:
from IPython.display import Markdown, display

display(Markdown(response.response))

Based on the query results, the top 5 most viewed blog posts are:

1. **"Blog Post 1"** with the content: *"Harry Was Here before you"*. Keywords include:
   - Harry
   - Was
   - Here
   - before
   - you

2. **"Blog Post 2"** with the content: *"You Were Here before Harry"*. Keywords include:
   - You
   - Were
   - Here
   - before
   - Harry

3. **"Blog Post 3"** with the content: *"Harry Was not Here"*. Keywords include:
   - Harry
   - Was
   - not
   - Here

4. **"What kind of org?"** with the content: *"Maybe you work with an organization. They have systems and charts and boxes. But the very nature of an organization is that someone developed it, figured it out and has to approve its changes. After all, it’s organized. Perhaps you work with an organism instead. An organism constantly changes. The..."*. Keywords include:
   - organization
   - systems
   - charts
   - boxes
   - developed
   - organized
   - organism
   - changes

5. **"Taking it very seriously"** with the content: *"Today, April first, was the day for a particular greeting, the only one except New Year’s that’s simply based on the date. Happy. It was a day that people on the internet understood that it’s possible to act as if and to do it with a smile. To pretend that we’re not on the brink of apocalypse of..."*. Keywords include:
   - April
   - first
   - day
   - greeting
   - New Year’s
   - date
   - Happy
   - internet
   - act
   - smile
   - apocalypse

These keywords are derived directly from the content of each blog post.

In [17]:
response = query_engine.query(
    "What are the top 5 least viewed blog posts in the year 2024 to 2025?"
)
print(response.response)

Querying SQL database: The question requires translating a natural language query into a SQL query to determine the top 5 least viewed blog posts within a specific time frame, which aligns with the capability described in choice (1).
SQL query: SELECT
    data_blogpost.id,
    data_blogpost.title,
    COUNT(analytics_pageview.id) AS view_count
FROM
    data_blogpost
LEFT JOIN
    analytics_pageview
ON
    data_blogpost.id = analytics_pageview.post_id
WHERE
    analytics_pageview.timestamp BETWEEN '2024-01-01' AND '2025-12-31'
GROUP BY
    data_blogpost.id,
    data_blogpost.title
ORDER BY
    view_count ASC
LIMIT 5;
SQL response: Based on the query results, the top 5 least viewed blog posts from the year 2024 to 2025 are as follows:

1. **"Forward"** with 161 views.
2. **"GenC"** with 206 views.
3. **"Toward better"** with 230 views.
4. **"Bulletins vs bulletin boards"** with 250 views.
5. **"Generous isn’t always the same as free"** with 261 views.

Thes
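
One thing worth checking in the generated SQL: it uses a `LEFT JOIN` but filters on `analytics_pageview.timestamp` in the `WHERE` clause. For a post with no page views the joined timestamp is `NULL`, so the `WHERE` clause discards that row and posts with zero views can never appear as "least viewed". The sketch below reproduces this with an in-memory SQLite database. The table and column names come from the log above; the two sample posts and the SQLite backend are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_blogpost (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE analytics_pageview (
        id INTEGER PRIMARY KEY, post_id INTEGER, timestamp TEXT
    );
    -- Hypothetical sample data: one post with two views, one with none.
    INSERT INTO data_blogpost VALUES (1, 'Seen twice'), (2, 'Never seen');
    INSERT INTO analytics_pageview VALUES
        (1, 1, '2024-03-01 09:00:00'),
        (2, 1, '2024-06-15 18:30:00');
""")

# Same shape as the generated query: the date filter sits in WHERE.
where_filter = """
    SELECT p.title, COUNT(v.id) AS view_count
    FROM data_blogpost p
    LEFT JOIN analytics_pageview v ON p.id = v.post_id
    WHERE v.timestamp BETWEEN '2024-01-01' AND '2025-12-31'
    GROUP BY p.id, p.title
    ORDER BY view_count ASC
"""

# Alternative: the date filter moves into the join condition,
# so posts without matching views survive with a count of 0.
on_filter = """
    SELECT p.title, COUNT(v.id) AS view_count
    FROM data_blogpost p
    LEFT JOIN analytics_pageview v
        ON p.id = v.post_id
        AND v.timestamp BETWEEN '2024-01-01' AND '2025-12-31'
    GROUP BY p.id, p.title
    ORDER BY view_count ASC
"""

rows_where = conn.execute(where_filter).fetchall()
rows_on = conn.execute(on_filter).fetchall()
print(rows_where)  # [('Seen twice', 2)]
print(rows_on)     # [('Never seen', 0), ('Seen twice', 2)]
```

Whether zero-view posts should count as "least viewed" is a product decision; the point is only that the two query shapes answer different questions.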

In [18]:
display(Markdown(response.response))

Based on the query results, the top 5 least viewed blog posts from the year 2024 to 2025 are as follows:

1. **"Forward"** with 161 views.
2. **"GenC"** with 206 views.
3. **"Toward better"** with 230 views.
4. **"Bulletins vs bulletin boards"** with 250 views.
5. **"Generous isn’t always the same as free"** with 261 views.

These blog posts had the lowest view counts within the specified timeframe.
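
The "specified timeframe" in the generated SQL is `BETWEEN '2024-01-01' AND '2025-12-31'`. When the column stores a time of day as well as a date, an upper bound written as a bare date typically means midnight at the start of that day, so views later on 2025-12-31 fall outside the range. A minimal sketch of the effect, assuming SQLite with timestamps stored as ISO-8601 text (the notebook does not state which database backs `analytics_pageview`, and the three rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analytics_pageview (id INTEGER PRIMARY KEY, timestamp TEXT)"
)
# Hypothetical views: first instant of 2024, afternoon of the last day of 2025,
# and first instant of 2026 (which should be excluded either way).
conn.executemany(
    "INSERT INTO analytics_pageview (timestamp) VALUES (?)",
    [("2024-01-01 00:00:00",), ("2025-12-31 14:00:00",), ("2026-01-01 00:00:00",)],
)

# Closed range with a bare end date, as in the generated query.
between_count = conn.execute(
    "SELECT COUNT(*) FROM analytics_pageview "
    "WHERE timestamp BETWEEN '2024-01-01' AND '2025-12-31'"
).fetchone()[0]

# Half-open range: inclusive start, exclusive start of the following year.
half_open_count = conn.execute(
    "SELECT COUNT(*) FROM analytics_pageview "
    "WHERE timestamp >= '2024-01-01' AND timestamp < '2026-01-01'"
).fetchone()[0]

print(between_count)    # 1
print(half_open_count)  # 2
```

The half-open form avoids having to decide what the last representable moment of the final day is.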