
Syncing with main on head repo #1

Merged
merged 11 commits into from
Mar 22, 2024
42 changes: 22 additions & 20 deletions README.md
@@ -5,7 +5,7 @@
</h2>

<p align="center">
<p align="center">Open Source Unified Search and Gen-AI Chat with your Docs.</p>
<p align="center">Open Source Gen-AI Chat + Unified Search.</p>

<p align="center">
<a href="https://docs.danswer.dev/" target="_blank">
@@ -22,16 +22,16 @@
</a>
</p>

<strong>[Danswer](https://www.danswer.ai/)</strong> lets you ask questions in natural language and get back
answers based on your team-specific documents. Think ChatGPT if it had access to your team's unique
knowledge. Connects to all common workplace tools such as Slack, Google Drive, Confluence, etc.
<strong>[Danswer](https://www.danswer.ai/)</strong> is the ChatGPT for teams. Danswer provides a Chat interface and plugs into any LLM of
your choice. Danswer can be deployed anywhere and at any scale: on a laptop, on-premises, or in the cloud. Since you own
the deployment, your user data and chats are fully under your own control. Danswer is MIT licensed and designed to be
modular and easily extensible. The system also comes fully ready for production usage with user authentication, role
management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts.

Teams have used Danswer to:
- Speed up customer support and escalation turnaround time.
- Improve Engineering efficiency by making documentation and code changelogs easy to find.
- Help sales teams get fuller context faster when preparing for calls.
- Track customer requests and priorities for Product teams.
- Help teams self-serve IT, Onboarding, HR, etc.
Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc.
By combining LLMs and team-specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if
it had access to your team's unique knowledge! It enables questions such as "A customer wants feature X, is this already
supported?" or "Where's the pull request for feature Y?"

<h3>Usage</h3>

@@ -57,19 +57,27 @@ We also have built-in support for deployment on Kubernetes. Files for that can b


## 💃 Main Features
* Chat UI with the ability to select documents to chat with.
* Create custom AI Assistants with different prompts and backing knowledge sets.
* Connect Danswer with the LLM of your choice (self-host for a fully airgapped solution).
* Document Search + AI Answers for natural language queries.
* Connectors to all common workplace tools like Google Drive, Confluence, Slack, etc.
* Chat support (think ChatGPT, but with access to your private knowledge sources).
* Slack integration to get answers and search results directly in Slack.


## 🚧 Roadmap
* Chat/Prompt sharing with specific teammates and user groups.
* Multi-modal model support: chat with images, video, etc.
* Choosing between LLMs and parameters during a chat session.
* Tool calling and agent configuration options.
* Organizational understanding and the ability to locate and suggest experts from your team.


## Other Notable Benefits of Danswer
* Best-in-class Hybrid Search across all sources (BM-25 + prefix-aware embedding models).
* User Authentication with document-level access management.
* Admin Dashboard to configure connectors, document-sets, access, etc.
* Custom deep learning models that learn from user feedback.
* Connect Danswer with the LLM of your choice for a fully airgapped solution.
* Easy deployment and ability to host Danswer anywhere of your choosing.


@@ -96,11 +104,5 @@ Efficiently pulls the latest changes from:
* Websites
* And more ...

## 🚧 Roadmap
* Organizational understanding.
* Ability to locate and suggest experts from your team.
* Code Search
* Structured Query Languages (SQL, Excel formulas, etc.)

## 💡 Contributing
Looking to contribute? Please check out the [Contribution Guide](CONTRIBUTING.md) for more details.
29 changes: 29 additions & 0 deletions backend/alembic/versions/173cae5bba26_port_config_store.py
@@ -0,0 +1,29 @@
"""Port Config Store

Revision ID: 173cae5bba26
Revises: e50154680a5c
Create Date: 2024-03-19 15:30:44.425436

"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# revision identifiers, used by Alembic.
revision = "173cae5bba26"
down_revision = "e50154680a5c"
branch_labels = None
depends_on = None


def upgrade() -> None:
op.create_table(
"key_value_store",
sa.Column("key", sa.String(), nullable=False),
sa.Column("value", postgresql.JSONB(astext_type=sa.Text()), nullable=False),
sa.PrimaryKeyConstraint("key"),
)


def downgrade() -> None:
op.drop_table("key_value_store")
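The `key_value_store` table created above is a plain string-keyed JSONB store. As a rough, hypothetical sketch of how a Postgres-backed dynamic config store could sit on top of it (the class and method names here are illustrative, not Danswer's actual implementation; a dict stands in for the table so the JSONB round-trip can be shown without a database):

```python
import json


class KeyValueStoreSketch:
    """Illustrative sketch of a key/value config store backed by a table
    with a string primary key and a JSONB value column. A dict stands in
    for the `key_value_store` table; a real implementation would issue
    INSERT ... ON CONFLICT / SELECT statements instead."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}  # key -> JSON text, as JSONB would hold it

    def store(self, key: str, value) -> None:
        # JSONB accepts any JSON-serializable value; json.dumps mimics that constraint
        self._rows[key] = json.dumps(value)

    def load(self, key: str):
        if key not in self._rows:
            raise KeyError(key)
        return json.loads(self._rows[key])


store = KeyValueStoreSketch()
store.store("feature_flags", {"chat": True, "search": True})
print(store.load("feature_flags"))
```

The single-table shape keeps the store schema-free: any JSON-serializable config value can be persisted without further migrations.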
28 changes: 28 additions & 0 deletions backend/alembic/versions/4738e4b3bae1_pg_file_store.py
@@ -0,0 +1,28 @@
"""PG File Store

Revision ID: 4738e4b3bae1
Revises: e91df4e935ef
Create Date: 2024-03-20 18:53:32.461518

"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "4738e4b3bae1"
down_revision = "e91df4e935ef"
branch_labels = None
depends_on = None


def upgrade() -> None:
op.create_table(
"file_store",
sa.Column("file_name", sa.String(), nullable=False),
sa.Column("lobj_oid", sa.Integer(), nullable=False),
sa.PrimaryKeyConstraint("file_name"),
)


def downgrade() -> None:
op.drop_table("file_store")
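The `file_store` table pairs a unique `file_name` with `lobj_oid`, the OID of a Postgres large object holding the file's bytes. A minimal sketch of that indirection pattern (purely illustrative, not Danswer's real code; dicts stand in for the large-object facility and the table):

```python
import itertools


class PGFileStoreSketch:
    """Sketch of the file_store pattern: contents live in large objects,
    and the table maps file_name (the primary key) to the object's OID."""

    def __init__(self) -> None:
        self._oids = itertools.count(16384)       # fake OID allocator
        self._large_objects: dict[int, bytes] = {}  # oid -> bytes
        self._file_store: dict[str, int] = {}       # file_name -> lobj_oid

    def save_file(self, file_name: str, content: bytes) -> int:
        oid = next(self._oids)
        self._large_objects[oid] = content
        # upsert keyed on file_name; drop the orphaned old object if overwriting
        old_oid = self._file_store.get(file_name)
        if old_oid is not None:
            del self._large_objects[old_oid]
        self._file_store[file_name] = oid
        return oid

    def read_file(self, file_name: str) -> bytes:
        return self._large_objects[self._file_store[file_name]]


fs = PGFileStoreSketch()
fs.save_file("notes.txt", b"v1")
fs.save_file("notes.txt", b"v2")
print(fs.read_file("notes.txt"))
```

Keeping only the OID in the row means overwriting a file is a cheap metadata update plus cleanup of the old object, rather than rewriting a wide row.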
@@ -0,0 +1,36 @@
"""Remove DocumentSource from Tag

Revision ID: 91fd3b470d1a
Revises: 173cae5bba26
Create Date: 2024-03-21 12:05:23.956734

"""
from alembic import op
import sqlalchemy as sa
from danswer.configs.constants import DocumentSource

# revision identifiers, used by Alembic.
revision = "91fd3b470d1a"
down_revision = "173cae5bba26"
branch_labels = None
depends_on = None


def upgrade() -> None:
op.alter_column(
"tag",
"source",
type_=sa.String(length=50),
existing_type=sa.Enum(DocumentSource, native_enum=False),
existing_nullable=False,
)


def downgrade() -> None:
op.alter_column(
"tag",
"source",
type_=sa.Enum(DocumentSource, native_enum=False),
existing_type=sa.String(length=50),
existing_nullable=False,
)
38 changes: 38 additions & 0 deletions backend/alembic/versions/e50154680a5c_no_source_enum.py
@@ -0,0 +1,38 @@
"""No Source Enum

Revision ID: e50154680a5c
Revises: fcd135795f21
Create Date: 2024-03-14 18:06:08.523106

"""
from alembic import op
import sqlalchemy as sa

from danswer.configs.constants import DocumentSource

# revision identifiers, used by Alembic.
revision = "e50154680a5c"
down_revision = "fcd135795f21"
branch_labels = None
depends_on = None


def upgrade() -> None:
op.alter_column(
"search_doc",
"source_type",
type_=sa.String(length=50),
existing_type=sa.Enum(DocumentSource, native_enum=False),
existing_nullable=False,
)
op.execute("DROP TYPE IF EXISTS documentsource")


def downgrade() -> None:
op.alter_column(
"search_doc",
"source_type",
type_=sa.Enum(DocumentSource, native_enum=False),
existing_type=sa.String(length=50),
existing_nullable=False,
)
118 changes: 118 additions & 0 deletions backend/alembic/versions/e91df4e935ef_private_personas_documentsets.py
@@ -0,0 +1,118 @@
"""Private Personas DocumentSets

Revision ID: e91df4e935ef
Revises: 91fd3b470d1a
Create Date: 2024-03-17 11:47:24.675881

"""
import fastapi_users_db_sqlalchemy
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "e91df4e935ef"
down_revision = "91fd3b470d1a"
branch_labels = None
depends_on = None


def upgrade() -> None:
op.create_table(
"document_set__user",
sa.Column("document_set_id", sa.Integer(), nullable=False),
sa.Column(
"user_id",
fastapi_users_db_sqlalchemy.generics.GUID(),
nullable=False,
),
sa.ForeignKeyConstraint(
["document_set_id"],
["document_set.id"],
),
sa.ForeignKeyConstraint(
["user_id"],
["user.id"],
),
sa.PrimaryKeyConstraint("document_set_id", "user_id"),
)
op.create_table(
"persona__user",
sa.Column("persona_id", sa.Integer(), nullable=False),
sa.Column(
"user_id",
fastapi_users_db_sqlalchemy.generics.GUID(),
nullable=False,
),
sa.ForeignKeyConstraint(
["persona_id"],
["persona.id"],
),
sa.ForeignKeyConstraint(
["user_id"],
["user.id"],
),
sa.PrimaryKeyConstraint("persona_id", "user_id"),
)
op.create_table(
"document_set__user_group",
sa.Column("document_set_id", sa.Integer(), nullable=False),
sa.Column(
"user_group_id",
sa.Integer(),
nullable=False,
),
sa.ForeignKeyConstraint(
["document_set_id"],
["document_set.id"],
),
sa.ForeignKeyConstraint(
["user_group_id"],
["user_group.id"],
),
sa.PrimaryKeyConstraint("document_set_id", "user_group_id"),
)
op.create_table(
"persona__user_group",
sa.Column("persona_id", sa.Integer(), nullable=False),
sa.Column(
"user_group_id",
sa.Integer(),
nullable=False,
),
sa.ForeignKeyConstraint(
["persona_id"],
["persona.id"],
),
sa.ForeignKeyConstraint(
["user_group_id"],
["user_group.id"],
),
sa.PrimaryKeyConstraint("persona_id", "user_group_id"),
)

op.add_column(
"document_set",
sa.Column("is_public", sa.Boolean(), nullable=True),
)
# fill in is_public for existing rows
op.execute("UPDATE document_set SET is_public = true WHERE is_public IS NULL")
op.alter_column("document_set", "is_public", nullable=False)

op.add_column(
"persona",
sa.Column("is_public", sa.Boolean(), nullable=True),
)
# fill in is_public for existing rows
op.execute("UPDATE persona SET is_public = true WHERE is_public IS NULL")
op.alter_column("persona", "is_public", nullable=False)


def downgrade() -> None:
op.drop_column("persona", "is_public")

op.drop_column("document_set", "is_public")

op.drop_table("persona__user")
op.drop_table("document_set__user")
op.drop_table("persona__user_group")
op.drop_table("document_set__user_group")
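The `is_public` additions in the upgrade above follow a common three-step backfill pattern: add the column as nullable so existing rows do not violate it, backfill those rows, then tighten the column to NOT NULL. A minimal stdlib sketch of the same ordering using sqlite3 (SQLite cannot alter a column to NOT NULL in place, so the sketch stops at verifying the backfill; in Alembic the final step is the `op.alter_column(..., nullable=False)` shown above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persona (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO persona (name) VALUES (?)", [("a",), ("b",)])

# Step 1: add the new column as nullable so existing rows remain valid
conn.execute("ALTER TABLE persona ADD COLUMN is_public BOOLEAN")
# Step 2: backfill existing rows
conn.execute("UPDATE persona SET is_public = 1 WHERE is_public IS NULL")
# Step 3 (Postgres/Alembic only): alter the column to NOT NULL

remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM persona WHERE is_public IS NULL"
).fetchone()[0]
print(remaining_nulls)  # 0
```

Doing the three steps in this order keeps the migration safe on populated tables; adding the column as NOT NULL directly would fail (or require a default) the moment any rows already exist.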
4 changes: 0 additions & 4 deletions backend/danswer/background/celery/celery.py
@@ -226,8 +226,4 @@ def clean_old_temp_files_task(
"task": "check_for_document_sets_sync_task",
"schedule": timedelta(seconds=5),
},
"clean-old-temp-files": {
"task": "clean_old_temp_files_task",
"schedule": timedelta(minutes=30),
},
}
4 changes: 2 additions & 2 deletions backend/danswer/configs/app_configs.py
@@ -224,8 +224,8 @@
#####
# Miscellaneous
#####
DYNAMIC_CONFIG_STORE = os.environ.get(
"DYNAMIC_CONFIG_STORE", "FileSystemBackedDynamicConfigStore"
DYNAMIC_CONFIG_STORE = (
os.environ.get("DYNAMIC_CONFIG_STORE") or "PostgresBackedDynamicConfigStore"
)
DYNAMIC_CONFIG_DIR_PATH = os.environ.get("DYNAMIC_CONFIG_DIR_PATH", "/home/storage")
JOB_TIMEOUT = 60 * 60 * 6 # 6 hours default
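The switch from `os.environ.get(key, default)` to `os.environ.get(key) or default` is not cosmetic: the `or` form also falls back when the variable is set but empty, as happens with a blank entry in an env file or compose file. A small illustration:

```python
import os

# Simulate a variable that is set but empty (e.g. a blank line in a .env file)
os.environ["DYNAMIC_CONFIG_STORE"] = ""

# .get() with a default only falls back when the variable is entirely unset:
via_get = os.environ.get("DYNAMIC_CONFIG_STORE", "PostgresBackedDynamicConfigStore")

# `or` additionally falls back when the variable is set to an empty string:
via_or = os.environ.get("DYNAMIC_CONFIG_STORE") or "PostgresBackedDynamicConfigStore"

print(via_get)  # "" (the empty string wins over the default)
print(via_or)   # "PostgresBackedDynamicConfigStore"
```

The trade-off is that the `or` form cannot express "deliberately empty", which is acceptable here since an empty backend name is never meaningful.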
8 changes: 7 additions & 1 deletion backend/danswer/connectors/confluence/rate_limit_handler.py
@@ -10,6 +10,9 @@
F = TypeVar("F", bound=Callable[..., Any])


RATE_LIMIT_MESSAGE_LOWERCASE = "Rate limit exceeded".lower()


class ConfluenceRateLimitError(Exception):
pass

@@ -27,7 +30,10 @@ def wrapped_call(*args: list[Any], **kwargs: Any) -> Any:
try:
return confluence_call(*args, **kwargs)
except HTTPError as e:
if e.response.status_code == 429:
if (
e.response.status_code == 429
or RATE_LIMIT_MESSAGE_LOWERCASE in e.response.text.lower()
):
raise ConfluenceRateLimitError()
raise
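The widened check above treats either a 429 status code or a "Rate limit exceeded" message in the body as a rate limit, since some Confluence responses reportedly signal throttling in the body rather than the status. A self-contained sketch of just that predicate (`FakeResponse` is a stand-in for the `requests` response object attached to the `HTTPError`):

```python
RATE_LIMIT_MESSAGE_LOWERCASE = "Rate limit exceeded".lower()


class FakeResponse:
    """Minimal stand-in for requests.Response: just status_code and text."""

    def __init__(self, status_code: int, text: str) -> None:
        self.status_code = status_code
        self.text = text


def is_rate_limited(response: FakeResponse) -> bool:
    # Mirrors the condition in the handler: explicit 429, or the
    # rate-limit message hidden in an otherwise different error body
    return (
        response.status_code == 429
        or RATE_LIMIT_MESSAGE_LOWERCASE in response.text.lower()
    )


print(is_rate_limited(FakeResponse(429, "")))                     # True
print(is_rate_limited(FakeResponse(500, "Rate limit exceeded")))  # True
print(is_rate_limited(FakeResponse(200, "ok")))                   # False
```

Lowercasing both sides keeps the substring match robust to casing differences across Confluence versions.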
