Skip to content
This repository was archived by the owner on Apr 6, 2026. It is now read-only.

Biber core#20

Merged
dastanmedetbekov merged 48 commits into
mainfrom
biber-core
Apr 4, 2026
Merged

Biber core#20
dastanmedetbekov merged 48 commits into
mainfrom
biber-core

Conversation

@dastanmedetbekov
Copy link
Copy Markdown
Member

No description provided.

ReBuss666 and others added 14 commits April 4, 2026 11:51
- Fixed docker-compose.yml: removed network_mode host, added extra_hosts
- Made /chat endpoint async to prevent event loop blocking
- Fixed frontend API URL to work on production
- Removed unused imports
- All services now communicate properly via Docker network
- Ollama accessible via host.docker.internal

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 4, 2026 06:20
Comment thread front/script.js Outdated
}

function setupHeroCards() {
const searchInput = document.getElementById("searchInput");
retrieval_mode="hybrid (ultra-fast)",
retrieval_error=None,
)
import asyncio
Comment thread backend_python/routers/rag.py Outdated
try:
from routers.rag import _save_chat_history
_save_chat_history(db, 1, query, answer, [person_id or 1]) # Хардкодим ID юзера для хакатона
except:
Comment thread backend_python/routers/rag.py Outdated
try:
from routers.rag import _save_chat_history
_save_chat_history(db, 1, query, answer, [person_id or 1]) # Хардкодим ID юзера для хакатона
except:
Comment thread backend_python/routers/rag.py Outdated
]

try:
from routers.rag import _save_chat_history
Comment thread backend_python/routers/rag.py Outdated
from database import get_db, User, ChatHistory, PersonCard, Document, DocumentChunk
from auth import get_current_user
from rag_engine import add_documents_to_vector_db, answer_with_rag
from auth import get_current_user, require_admin
dastanmedetbekov and others added 4 commits April 4, 2026 12:22
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
…tion or class'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
…eption''

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
@dastanmedetbekov dastanmedetbekov merged commit c984207 into main Apr 4, 2026
6 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR shifts the project’s bootstrap/testing workflow toward an Ollama-first setup by removing legacy system/integration test scripts, adding Ollama integration tests, and expanding the startup/deploy automation.

Changes:

  • Removed legacy test_system.sh and test_integration.sh scripts.
  • Added Ollama verification scripts (test_ollama.sh, test_ollama_integration.py) to validate host + container connectivity and basic generation/embedding behavior.
  • Extended scripts/start.sh and introduced scripts/deploy.sh to automate environment setup, Dockerfile recovery/fallback generation, container cleanup, and Ollama startup/model pull.

Reviewed changes

Copilot reviewed 49 out of 52 changed files in this pull request and generated 21 comments.

Show a summary per file
File Description
test_system.sh Removed legacy system test script in favor of updated bootstrap/testing flow.
test_integration.sh Removed legacy integration test script.
test_ollama.sh Added shell-based checks for Ollama availability/config and in-container access.
test_ollama_integration.py Added Python integration test exercising embeddings + LLM generation via the backend RAG engine.
scripts/start.sh Expanded startup automation: env template creation, legacy env migration, layout normalization, Dockerfile recovery/fallbacks, container name cleanup, and Ollama management.
scripts/deploy.sh Added one-command deploy script for Debian/Ubuntu: installs deps, configures Ollama systemd, pulls model, clones repo, and starts services.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +331 to 344
answer=answer,
sources=[person_id or 1],
citations=fake_citations,
llm_provider="ollama",
llm_model="llama3:8b",
embedding_provider="ollama",
embedding_model="ollama-embed",
retrieval_mode="hybrid (ultra-fast)",
retrieval_error=None,
)
import asyncio
from functools import partial

query = chat_request.query
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The /chat handler returns a ChatResponse at line 331, but the function continues with additional RAG logic afterward (imports + executor path). This tail code is unreachable and also references current_user even though the handler signature no longer injects it, which makes the intent unclear and risks future refactors accidentally re-enabling a broken path. Remove the dead code or refactor so there is a single, consistent execution path (either the SQL fallback mode or the real answer_with_rag flow).

Copilot uses AI. Check for mistakes.
current_user: User = Depends(get_current_user)
async def chat(
chat_request: ChatRequest,
db: Session = Depends(get_db)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POST /chat no longer requires authentication (current_user: User = Depends(get_current_user) was removed), so any unauthenticated caller can invoke the LLM and read back content. If chat is intended to be protected (consistent with /chat/history), re-add Depends(get_current_user) and use the authenticated user id for persistence/rate limiting.

Suggested change
db: Session = Depends(get_db)
db: Session = Depends(get_db),
current_user: User = Depends(get_current_user)

Copilot uses AI. Check for mistakes.
Comment on lines +325 to +334
try:
_save_chat_history(db, 1, query, answer, [person_id or 1]) # Хардкодим ID юзера для хакатона
except Exception:
pass

return ChatResponse(
answer=answer,
sources=[person_id or 1],
citations=fake_citations,
llm_provider="ollama",
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chat history persistence in /chat is hardcoded to user_id=1 (_save_chat_history(db, 1, ...)) and hardcodes sources=[person_id or 1]. This breaks multi-user correctness and will mix all users' histories together (and may leak one user's queries to another via /chat/history). Use the authenticated current_user.id and real person_id/retrieval sources instead of hardcoded placeholders.

Copilot uses AI. Check for mistakes.
async def upload_documents_batch(
files: List[UploadFile] = File(...),
person_id: Optional[int] = Form(None),
db: Session = Depends(get_db)
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POST /api/documents/upload-batch does not enforce any auth/admin check, but it creates SQL Document/DocumentChunk rows for arbitrary uploads. Since front/admin-script.js treats this as an admin-only action, the API should require current_user: User = Depends(require_admin) (or at least get_current_user) to prevent unauthenticated data injection.

Suggested change
db: Session = Depends(get_db)
db: Session = Depends(get_db),
current_user: User = Depends(get_current_user)

Copilot uses AI. Check for mistakes.
"filename": filename,
"status": "imported",
"chunks_created": len(raw_chunks),
"vector_indexed": True # Фейк для фронтенда, чтобы горела зеленая галочка
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The batch upload endpoint sets "vector_indexed": True in the response even though it never calls add_documents_to_vector_db (and explicitly comments it as a fake). This will mislead the UI/ops into thinking documents are searchable via RAG when they are not. Return an accurate flag (false) or actually index the chunks into the vector store.

Suggested change
"vector_indexed": True # Фейк для фронтенда, чтобы горела зеленая галочка
"vector_indexed": False # Vector DB indexing is not performed in this endpoint

Copilot uses AI. Check for mistakes.
Comment thread front/list.html
Comment on lines +60 to +66
<option value="" data-i18n="filter_all_regions">Все регионы</option>
</select>
<label class="filter-label" for="filterYear">Год рождения</label>
<label class="filter-label" for="filterYear" data-i18n="filter_year">Год репрессии</label>
<select id="filterYear" class="filter-select">
<option value="">Все годы</option>
<option value="" data-i18n="filter_all_years">Все годы</option>
</select>
<button class="filter-apply-btn" type="button">Применить</button>
<button class="filter-apply-btn" type="button" data-i18n="filter_apply">Применить</button>
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter label says "Год репрессии" but the filtering logic in front/list-script.js currently matches on birth_year / death_year (not arrest/sentence years). This is confusing for users; either rename the label back to birth/death year, or change the filter to use a repression-related field (e.g., arrest_date year) if that’s what’s intended.

Copilot uses AI. Check for mistakes.
Comment on lines +410 to +423
def get_runtime_config(mask_secrets: bool = True) -> Dict[str, Any]:
"""Return current RAG configuration for frontend/admin."""
return {
"rag_llm_provider": RAG_LLM_PROVIDER,
"rag_embedding_provider": RAG_EMBEDDING_PROVIDER,
"rag_gemini_model": RAG_GEMINI_MODEL,
"rag_claude_model": RAG_CLAUDE_MODEL,
"rag_ollama_model": RAG_OLLAMA_MODEL,
"rag_gemini_embedding_model": RAG_GEMINI_EMBEDDING_MODEL,
"rag_openai_embedding_model": RAG_OPENAI_EMBEDDING_MODEL,
"llm_provider": RAG_LLM_PROVIDER,
"embedding_provider": RAG_EMBEDDING_PROVIDER,
"model": RAG_OLLAMA_MODEL if RAG_LLM_PROVIDER == "ollama" else RAG_GEMINI_MODEL
}
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_runtime_config() is used by /admin/ai/runtime-config and the frontend admin panel expects fields like gemini_api_key, openai_api_key, etc. The returned dict currently omits these keys entirely, so the UI cannot display whether keys are configured (masked or otherwise). Consider including these fields (masked when mask_secrets=True) or adjusting the frontend contract accordingly.

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +10
# Настройка путей
sys.path.insert(0, '/home/adelete/hackathon/backend_python')

Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test hard-codes a local filesystem path (/home/adelete/hackathon/backend_python) into sys.path, which will fail for every other developer/CI environment. Use a relative path from the repo root (e.g., derive from __file__) or package/import the backend module properly so the test can run anywhere.

Copilot uses AI. Check for mistakes.
Comment thread scripts/deploy.sh
Comment on lines +189 to +199
# 8. Запуск Docker контейнеров
echo -e "${YELLOW}🐳 Сборка и запуск Docker контейнеров...${NC}"
echo "Это займёт несколько минут..."

# Из-за того что мы добавили пользователя в группу docker,
# нужно перелогиниться. Используем newgrp для этой сессии.
newgrp docker << EOFGRP
docker-compose build
docker-compose up -d
EOFGRP

Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deploy script installs docker-compose-v2 (Docker Compose plugin), but later invokes docker-compose (hyphenated) inside the newgrp block. On many Debian/Ubuntu setups with the v2 plugin, only docker compose is available and docker-compose is not, so this can fail. Consider using docker compose ... consistently or adding a small helper that prefers docker compose and falls back to docker-compose (similar to scripts/start.sh).

Copilot uses AI. Check for mistakes.
Comment thread scripts/start.sh
Comment on lines +381 to +385
fi

log "Ensuring Ollama model is available: $ollama_model"
ollama pull "$ollama_model" || true
}
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure_ollama_on_11434() runs ollama pull "$ollama_model" || true, which will silently continue even if the model pull fails (network issues, invalid model name). Since the rest of the stack is configured to use that model, this can lead to confusing runtime failures later. Consider failing the script when the pull fails, or at least logging a warning and validating the model exists via ollama list before proceeding.

Copilot uses AI. Check for mistakes.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants