Python update (3.13) #2149
Merged
- Set minimum Python version to 3.11 (removed 3.10 support)
- Added support for Python 3.14
- Updated CI workflows: single-version jobs use 3.14, matrix jobs use 3.11 and 3.14
- Fixed license format to use SPDX-compatible format for Python 3.14
- Updated pyarrow to >=22.0.0 for Python 3.14 wheel support
- Added explicit fastuuid~=0.14 and blis~=1.3 for Python 3.14 compatibility
- Replaced all loose version constraints (>=) with compatible release (~=) for better lock file control
- Applied stricter versioning to all packages: graphrag, graphrag-common, graphrag-storage, unified-search-app
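The move from `>=` to `~=` in the list above can be illustrated with a minimal sketch of the compatible-release rule from PEP 440. It assumes plain numeric versions only (no pre-releases, epochs, or local tags), and the helper names `parse` and `compatible` are hypothetical, not part of graphrag; real tooling would normally rely on the `packaging` library instead.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '1.3.3' into (1, 3, 3)."""
    return tuple(int(part) for part in version.split("."))


def compatible(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec` (simplified PEP 440).

    `~=X.Y` means `>=X.Y, ==X.*`; `~=X.Y.Z` means `>=X.Y.Z, ==X.Y.*`.
    """
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two version segments")
    cand = parse(candidate)
    # Pad with zeros so that '1.4' compares like '1.4.0' against a longer spec.
    width = max(len(cand), len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    base_padded = base + (0,) * (width - len(base))
    # Every segment of the spec except the last must match exactly.
    prefix = base[:-1]
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix
```

Under this sketch, `compatible("1.3.5", "1.3.3")` holds while `compatible("1.4.0", "1.3.3")` does not, whereas a loose `>=1.3.3` constraint would accept both. That narrower window is what keeps lock-file resolution predictable.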
Numpy 1.25.x has access violation issues on Python 3.14 Windows. Numpy 2.x has proper Python 3.14 support including Windows wheels.
Pandas 2.2.x was compiled against numpy 1.x and causes ABI incompatibility errors with numpy 2.x. Pandas 2.3.0+ supports numpy 2.x properly.
Scipy versions < 1.15.0 have C extensions built against numpy 1.x and are incompatible with numpy 2.x, causing dtype size errors.
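The dependency chain described in these commit messages (numpy 2.x, which in turn needs pandas 2.3.0+ and scipy 1.15.0+) could be checked at runtime with a sketch like the following. The names `find_abi_mismatches`, `installed_versions`, and `MINIMUMS_WITH_NUMPY_2` are hypothetical and introduced only for illustration; the floors are the ones stated above, and version parsing is deliberately simplified to the leading numeric segments.

```python
import re
from importlib import metadata

# Floors taken from the commit messages above; they only matter once numpy is 2.x.
MINIMUMS_WITH_NUMPY_2 = {"pandas": (2, 3, 0), "scipy": (1, 15, 0)}


def numeric(version: str) -> tuple[int, ...]:
    """Keep the leading numeric segments, padded to three parts.

    For example '1.15.0.post1' -> (1, 15, 0) and '2.3' -> (2, 3, 0).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def find_abi_mismatches(installed: dict[str, str]) -> list[str]:
    """Return the packages that are too old for an installed numpy 2.x."""
    numpy_version = installed.get("numpy")
    if numpy_version is None or numeric(numpy_version)[0] < 2:
        return []  # numpy 1.x or absent: the floors above do not apply
    problems = []
    for name, floor in MINIMUMS_WITH_NUMPY_2.items():
        version = installed.get(name)
        if version is not None and numeric(version) < floor:
            problems.append(name)
    return problems


def installed_versions(names=("numpy", "pandas", "scipy")) -> dict[str, str]:
    """Look up versions in the current environment, skipping missing packages."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found
```

Calling `find_abi_mismatches(installed_versions())` in a fresh environment gives a quick indication of whether the resolved set matches the constraints above, before any C-extension import has a chance to fail.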
- Set Python version range to 3.11-3.13 (removed 3.14 support)
- Updated CI workflows: single-version jobs use 3.13, matrix jobs use 3.11 and 3.13
- Dependencies optimized for Python 3.13 compatibility:
  - pyarrow~=22.0 (has Python 3.13 wheels)
  - numpy~=1.26
  - pandas~=2.2
  - blis~=1.0
  - fastuuid~=0.13
- Applied stricter version constraints using ~= operator throughout
- Updated uv.lock with resolved dependencies
…tibility Numpy 1.26.x causes access violations on Python 3.13 Windows. Numpy 2.1+ has proper Python 3.13 support with Windows wheels. Pandas 2.3+ is required for numpy 2.x compatibility.
dworthen added a commit that referenced this pull request on Jan 27, 2026
* Remove graph embedding and UMAP (#2048) * Remove umap/layout operation * Remove graph embedding * Bump unified-search to GR 2.5.0 * Remove graph vis from unified-search * Remove file filtering (#2050) * Remove document filtering * Semver * Fix integ tests * Fix file find tuple * Fix another dangling find tuple * Remove text unit grouping (#2052) * Remove text unit group_by_columns * Semver * Fix default token split test * Fix models in config test samples * Fix token length in context sort test * Fix document sort * Re-implement hierarchical Leiden (#2049) * Use graspologic-native hierarchical leiden * Re-implement largest_connected_component * Copy in modularity * Use graspologic-native directly in pyproject * Remove directed graph tests (we don't use this) * Semver * Remove graspologic dep * Use 4.1 and text-embedding-3-large as defaults * Update comment * Clean vector store (#2077) * clean vector store code * fix * fix launch.json --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Update v3/main missing config + functions (#2082) * reduce schema fields (#2089) * reduce schema fields * fix launch.json --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Remove strategy dicts (#2090) * Remove "strategy" from community reports config/workflow * Remove extraction strategy from extract_graph * Remove summarization strategy from extract_graph * Remove strategy from claim extraction * Strongly type prompt templates * Remove strategy from embed_text * Push hydrated params into community report workflows * Push hyrdated params into extract covariates * Push hydrated params into extract graph NLP * Push hydrated params into extract graph * Push hydrated params into text embeddings * Remove a few more low-level defaults * Semver * Remove configurable prompt delimiters * Update smoke tests * Remove fnllm (#2095) * Sort deps alpha * Remove multi search (#2093) * Remove multi-search from 
CLI * Remove multi-search from API * Flatten vector_store config * Push hydrated vector store down to embed_text * Remove outputs from config * Remove multi-search notebook/docs * Add missing response_type in basic search API * Fix basic search context and id mapping * Fix v1 migration notebook * Fix query entity search tests * V3 docs and cleanup (#2100) * Remove community contrib notebooks * Add migration notebook and breaking changes page edits * Update/polish docs * Make model instance name configurable * Add vector schema updates to v3 migration notebook * Spellcheck * Bump smoke test runtimes * Remove document overwrite (#2101) * remove document overwrite from vector store configuration * remove document overwrite and refactor load documents method * fix test * fix test * fix test --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Unified factory (#2105) * Simplify Factory interface * Migrate CacheFactory to standard base class * Migrate LoggerFactory to standard base class * Migrate StorageFactory to standard base class * Migrate VectorStoreFactory to standard base class * Update vector store example notebook * Delete notebook outputs * Move default providers into factories * Move retry/limit tests into integ * Split language model factories * Set smoke test tpm/rpm * Fix factory integ tests * Add method to smoke test, switch text to 'fast' * Fix text smoke config for fast workflow * Add new workflows to text smoke test * Convert input readers to a proper factory * Remove covariates from fast smoke test * Update docs for input factory * Bump smoke runtime * Even longer runtime * min-csv timeout * Remove unnecessary lambdas * Prefix vector store (#2106) * add prefix to vector store configuration and removal of container name * docs updated * change prefix property name * change prefix property name * feedback implemented --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * 
fix for container name * Restructure project as monorepo. (#2111) * Restructure project as monorepo. * Fix formatting * Storage fixes and cleanup (#2118) * Fix pipeline recursion * Remove base_dir from storage.find * Remove max_count from storage.find * Remove prefix on storage integ test * Add base_dir in creation_date test * Wrap base_dir in Path * Use constants for input/update directories * Nov 2025 housekeeping (#2120) * Remove gensim sideload * Split CI build/type checks from unit tests * Thorough review of docs to align with v3 * Format * Fix version * Fix type * Graphrag config (#2119) * Add load_config to graphrag-common package. * Empty graph guards (#2126) * Remove networkx from graph_extractor and clean out redundancy * Bubble pipeline error to console * Remove embeddings optional new (#2128) * remove optional embeddings * fix test * fix tests * fix pipeline * fix test * fix test * fix test * fix tests --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Format * Add empty checks for NLP graphs (#2133) * Init command asks for models (#2137) * Add init prompting for models * Remove hard-coded model config validation * Switch to typer option prompt for full CLI use with models * Update getting started for init model input * Bump request timeout and overall smoke test timeout * Add graphrag-storage. (#2127) * Add graphrag-storage. 
* Python update (3.13) (#2149) * Update to python 3.14 as default, with range down to 3.10 * Fix enum value in query cli * Update pyarrow * Update py version for storage package * Remove 3.10 * add fastuuid * Update Python support to 3.11-3.14 with stricter dependency constraints - Set minimum Python version to 3.11 (removed 3.10 support) - Added support for Python 3.14 - Updated CI workflows: single-version jobs use 3.14, matrix jobs use 3.11 and 3.14 - Fixed license format to use SPDX-compatible format for Python 3.14 - Updated pyarrow to >=22.0.0 for Python 3.14 wheel support - Added explicit fastuuid~=0.14 and blis~=1.3 for Python 3.14 compatibility - Replaced all loose version constraints (>=) with compatible release (~=) for better lock file control - Applied stricter versioning to all packages: graphrag, graphrag-common, graphrag-storage, unified-search-app * update uv lock * Pin blis to ~=1.3.3 to ensure Python 3.14 wheel availability * Update uv lock * Update numpy to >=2.0.0 for Python 3.14 Windows compatibility Numpy 1.25.x has access violation issues on Python 3.14 Windows. Numpy 2.x has proper Python 3.14 support including Windows wheels. * update uv lock * Update pandas to >=2.3.0 for numpy 2.x compatibility Pandas 2.2.x was compiled against numpy 1.x and causes ABI incompatibility errors with numpy 2.x. Pandas 2.3.0+ supports numpy 2.x properly. * update uv.lock * Add scipy>=1.15.0 for numpy 2.x compatibility Scipy versions < 1.15.0 have C extensions built against numpy 1.x and are incompatible with numpy 2.x, causing dtype size errors. 
* update uv lock * Update Python support to 3.11-3.13 with compatible dependencies - Set Python version range to 3.11-3.13 (removed 3.14 support) - Updated CI workflows: single-version jobs use 3.13, matrix jobs use 3.11 and 3.13 - Dependencies optimized for Python 3.13 compatibility: - pyarrow~=22.0 (has Python 3.13 wheels) - numpy~=1.26 - pandas~=2.2 - blis~=1.0 - fastuuid~=0.13 - Applied stricter version constraints using ~= operator throughout - Updated uv.lock with resolved dependencies * Update numpy to 2.1+ and pandas to 2.3+ for Python 3.13 Windows compatibility Numpy 1.26.x causes access violations on Python 3.13 Windows. Numpy 2.1+ has proper Python 3.13 support with Windows wheels. Pandas 2.3+ is required for numpy 2.x compatibility. * update vsts.yml python version * Add GraphRAG Cache package. (#2153) * Add GraphRAG Cache package. * Fix a bunch of module comments and function visibility (#2154) * Issue #2004 fix (#2159) * fix issue #2004 using KeenhoChu idea in his PR * add unit test for dynamic community selection * add unit test for dynamic community selection implementing #2158 logic --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Mismatch between header in community report generation prompt examples and input data (id vs human_readable_id) (#2161) * fix issue #860 for mismatch in prompts and input * fix format --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Chunker factory (#2156) * Delete NoopTextSplitter * Delete unused check_token_limit * Add base chunking factory and migrate workflow to use it * Split apart chunker module * Co-locate chunking/splitting * Collapse token splitting functionality into one class/function * Restore create_base_text_units parameterization * Move Tokenizer base class to common package * Move pre-pending into chunkers * Streamline config * Fix defaults construction * Add prepending tests * Remove chunk_size_includes_metadata 
config * Revert ChunkingDocument interface * Move metadata prepending to a util * Move Tokenizer back to GR core * Fix tokenizer removal from chunker * Set defaults for chunking config * Move chunking to monorepo package * Format * Typo * Add ChunkResult model * Streamline chunking config * Add missing version updates for graphrag_chunking * Input factory (#2168) * Update input factory to match other factories * Move input config alongside input readers * Move file pattern logic into InputReader * Set encoding default * Clean up optional column configs * Combine structured data extraction * Remove pandas from input loading * Throw if empty documents * Add json lines (jsonl) input support * Store raw data * Fix merge imports * Move metadata handling entirely to chunking * Nicer automatic title * Typo * Add get_property utility for nested dictionary access with dot notation * Update structured_file_reader to use get_property utility * Extract input module into new graphrag-input monorepo package - Create new graphrag-input package with input loading utilities - Move InputConfig, InputFileType, InputReader, TextDocument, and file readers (CSV, JSON, JSONL, Text) - Add get_property utility for nested dictionary access with dot notation - Include hashing utility for document ID generation - Update all imports throughout codebase to use graphrag_input - Add package to workspace configuration and release tasks - Remove old graphrag.index.input module * Rename ChunkResult to TextChunk and add transformer support - Rename chunk_result.py to text_chunk.py with ChunkResult -> TextChunk - Add 'original' field to TextChunk to track pre-transform text - Add optional transform callback to chunker.chunk() method - Add add_metadata transformer for prepending metadata to chunks - Update create_chunk_results to apply transforms and populate original - Update sentence_chunker and token_chunker with transform support - Refactor create_base_text_units to use new transformer pattern - 
Rename pluck_metadata to get/collect methods on TextDocument * Back-compat comment * Align input config type name with other factory configs * Add MarkItDown support * Remove pattern default from MarkItDown reader * Remove plugins flag (implicit disabled) * Format * Update verb tests * Separate storage from input config * Add empty objects for NaN raw_data * Fix smoke tests * Fix BOM in csv smoke * Format * DRIFT fixes (#2171) * Use stable ids for community reports * Remove deprecated title from embedding flow * Remove embedding column from df loaders * Fix lancedb insertion * Add drift back to smoke tests * Fix mock embedder to match default embedding length * Fix DRIFT notebook * Push drift_k_followups through to prompt * Format * Vector package (#2172) * Extract graphrag-vectors package * Simplify vector factory usage and config defaults * Update factory integ initializers * Fix mock patch * Format * Register vector stores in tests * Set a default vector store name * Update vector readme * Remove impls from init * Move some validation into impls * Remove index_prefix * Move duplicate method to base class * Fix smoke vector config * Update index bug (#2173) * fix update index bug * blob storage bug fix --------- Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Add GraphRAG LLM package. (#2174) * Update documentation for v3 release (#2176) update documentation for v3 release Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> * Graphrag llm cleanup (#2181) * Migration update (#2180) * fix formatting. --------- Co-authored-by: Nathan Evans <github@talkswithnumbers.com> Co-authored-by: gaudyb <85708998+gaudyb@users.noreply.github.com> Co-authored-by: Gaudy Blanco <gaudy-microsoft@MacBook-Pro-m4-Gaudy-For-Work.local> Co-authored-by: Andres Morales <86074752+andresmor-ms@users.noreply.github.com>
Rolls py version to 3.14. CI still has a build with 3.10 to keep it lightly tested until it falls out of regular use.
Edit: settled on 3.13 due to upstream dependency readiness (such as spacy). Will revisit 3.14 once it is further into its lifecycle.