Concurrent processing of files (#665)
* Update README.md

* Dropped the old vector index (#652)

* added cypher_queries and llm chatbot files

* updated llm-chatbot-python

* added llm-chatbot-python

* updated llm-chatbot-python folder

* Added chatbot "hybrid" mode use case

* added the concurrent file processing

* page refresh scenario

* fixed waiting files processing issue in refresh scenario

* removed boolean param

* fixed processedCount issue

* checkbox with waiting check

* fixed the refresh scenario with processing files

* processing files check

* server side error

* processing file count check for processing files less than batch size

* processing count check to handle all selected files

* created helper functions

* code improvements

* __ changes (#656)

* DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659)

Co-authored-by: jeromechoo <hello@jeromechoo.com>

* Removed experiments/llm-chatbot-python folder from DEV branch

* reduced the password clear timeout

* Removed experiments/Cypher_Queries.ipynb file from DEV branch

* disabled the close button on banner and connection dialog while API is in pending state

* update delete query with entities

* node id check (#663)

* Status source and type filtering (#664)

* status source

* Name change

* type change

* rollback to previous working nvl version

* added the alert

* add BATCH_SIZE to docker

* temp fixes for 0.3.1

* alert fix for less than batch size processing

* new virtual env

* added Hybrid Chat modes (#670)

* Rename the function #657

* label and checkboxes placement changes (#675)

* label and checkboxes placement changes

* checkbox placement changes

* Graph node filename check

* env fixes with latest nvl libraries

* format fixes

* removed local files

* Remove TotalPages when save file on local (#684)

* file_name reference and verify_ssl issue fixed (#683)

* User flow changes for recreating supported vector index (#682)

* removed the if check

* Add one more check for create vector index when chunks exist without embeddings

* removed local files

* condition changes

* chunks exists check

* chunk exists without embeddings check

* vector Index issue fixed

* vector index with different dimension

* Update graphDB_dataAccess.py

---------

Co-authored-by: Pravesh Kumar <121786590+praveshkumar1988@users.noreply.github.com>

* ndl changes

* property spell fix

---------

Co-authored-by: vasanthasaikalluri <165021735+vasanthasaikalluri@users.noreply.github.com>
Co-authored-by: Jayanth T <jayanth_t@persistent.com>
Co-authored-by: abhishekkumar-27 <164544129+abhishekkumar-27@users.noreply.github.com>
Co-authored-by: Prakriti Solankey <156313631+prakriti-solankey@users.noreply.github.com>
Co-authored-by: Jerome Choo <mail@jeromechoo.com>
Co-authored-by: jeromechoo <hello@jeromechoo.com>
Co-authored-by: Pravesh Kumar <121786590+praveshkumar1988@users.noreply.github.com>
8 people committed Aug 13, 2024
1 parent 81e64f4 commit ef46e89
Showing 15 changed files with 418 additions and 88 deletions.
6 changes: 4 additions & 2 deletions .gitignore
@@ -3,7 +3,7 @@
 __pycache__/
 *.py[cod]
 *$py.class
-
+.vennv
 # C extensions
 *.so
 /backend/graph
@@ -170,4 +170,6 @@ google-cloud-cli-469.0.0-linux-x86_64.tar.gz
 /backend/chunks
 google-cloud-cli-linux-x86_64.tar.gz
 .vennv
-newenv
+newenv
+files
+

2 changes: 1 addition & 1 deletion backend/src/graphDB_dataAccess.py
@@ -185,7 +185,7 @@ def execute_query(self, query, param=None):

     def get_current_status_document_node(self, file_name):
         query = """
-                MATCH(d:Document {fileName : $file_name}) RETURN d.stats AS Status , d.processingTime AS processingTime,
+                MATCH(d:Document {fileName : $file_name}) RETURN d.status AS Status , d.processingTime AS processingTime,
                 d.nodeCount AS nodeCount, d.model as model, d.relationshipCount as relationshipCount,
                 d.total_pages AS total_pages, d.total_chunks AS total_chunks , d.fileSize as fileSize,
                 d.is_cancelled as is_cancelled, d.processed_chunk as processed_chunk, d.fileSource as fileSource
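
In Cypher, reading a property that a node does not have yields null rather than an error, so the misspelled `d.stats` would most likely have returned a null `Status` without failing. A minimal sketch of a guard that flags such nulls in a result row follows. The helper `null_fields` and both sample rows are hypothetical illustrations, not part of the repository.

```python
def null_fields(row, required=("Status",)):
    """Return the required keys that are missing or None in a result row."""
    return [key for key in required if row.get(key) is None]


# Made-up sample rows: what the query might return before and after the fix.
before_fix = {"Status": None, "nodeCount": 12}
after_fix = {"Status": "Processing", "nodeCount": 12}

print(null_fields(before_fix))
print(null_fields(after_fix))
```

A check like this could run wherever the row is consumed, so a misspelled property surfaces as an explicit warning instead of a silently empty status.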
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -61,6 +61,7 @@ services:
       - CHUNK_SIZE=${CHUNK_SIZE-5242880}
       - ENV=${ENV-DEV}
       - CHAT_MODES=${CHAT_MODES-""}
+      - BATCH_SIZE=${BATCH_SIZE-2}
     volumes:
       - ./frontend:/app
       - /app/node_modules
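
The `${BATCH_SIZE-2}` form is Compose's shell-style default: it substitutes `2` only when `BATCH_SIZE` is unset, whereas `${BATCH_SIZE:-2}` would also cover an empty value. A minimal Python sketch of the same lookup is shown here. The helper name `read_batch_size` is hypothetical, and rejecting empty or non-positive values is this sketch's own choice, not something the commit does.

```python
import os


def read_batch_size(env=None, default=2):
    """Mimic ${BATCH_SIZE-2}: fall back to the default only when unset."""
    env = os.environ if env is None else env
    raw = env.get("BATCH_SIZE")
    if raw is None:  # unset, so use the default, like ${BATCH_SIZE-2}
        return default
    # Compose would pass an empty string through unchanged; this sketch
    # fails loudly instead (int("") raises ValueError).
    value = int(raw)
    if value < 1:
        raise ValueError("BATCH_SIZE must be a positive integer")
    return value
```

Calling `read_batch_size({})` returns the default, while `read_batch_size({"BATCH_SIZE": "5"})` returns 5.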
6 changes: 3 additions & 3 deletions frontend/Dockerfile
@@ -1,21 +1,21 @@
 # Step 1: Build the React application
 FROM node:20 AS build

-ARG VITE_BACKEND_API_URL="https://dev-backend-dcavk67s4a-uc.a.run.app"
+ARG VITE_BACKEND_API_URL="http://localhost:8000"
 ARG VITE_REACT_APP_SOURCES=""
 ARG VITE_LLM_MODELS=""
-ARG VITE_GOOGLE_CLIENT_ID="967196130891-vsu933h8nj6b6l6gfuk0nhh0pcagu0aa.apps.googleusercontent.com"
+ARG VITE_GOOGLE_CLIENT_ID=""
 ARG VITE_BLOOM_URL="https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true"
 ARG VITE_TIME_PER_CHUNK=4
 ARG VITE_TIME_PER_PAGE=50
 ARG VITE_LARGE_FILE_SIZE=5242880
 ARG VITE_CHUNK_SIZE=5242880
 ARG VITE_CHAT_MODES=""
 ARG VITE_ENV="DEV"
+ARG VITE_BATCH_SIZE=2

 WORKDIR /app
 COPY package.json yarn.lock ./
-RUN yarn add @neo4j-nvl/base @neo4j-nvl/react
 RUN yarn install
 COPY . ./
 RUN VITE_BACKEND_API_URL=$VITE_BACKEND_API_URL \
3 changes: 2 additions & 1 deletion frontend/example.env
@@ -8,4 +8,5 @@ TIME_PER_PAGE=50
 CHUNK_SIZE=5242880
 LARGE_FILE_SIZE=5242880
 GOOGLE_CLIENT_ID=""
-CHAT_MODES=""
+CHAT_MODES=""
+BATCH_SIZE=2