I use Python to preprocess a file and then load it into the database. More specifically, I have defined the following import flow:
1. Users specify the location of the input file
2. The code preprocesses the file
3. The code requests the database's import directory and moves the preprocessed file into it
4. The file is imported using apoc.load.csv
5. Once the import is done, I tidy up by deleting the file from the import folder.
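The file handling in steps 3 and 5 can be sketched roughly as follows. This is a minimal illustration, not my actual code: `stage_for_import` and `clean_up` are hypothetical names, and the import directory is passed in rather than requested from the database.

```python
import shutil
from pathlib import Path


def stage_for_import(preprocessed_file: Path, import_dir: Path) -> Path:
    """Step 3: move the preprocessed file into the database's import folder."""
    import_dir.mkdir(parents=True, exist_ok=True)
    target = import_dir / preprocessed_file.name
    shutil.move(str(preprocessed_file), str(target))
    return target


def clean_up(staged_file: Path) -> None:
    """Step 5: delete the staged file once the import has finished."""
    # This is the call that fails with PermissionError on Windows
    # while another process still holds the file open.
    staged_file.unlink()
```

Step 4 (the `apoc.load.csv` call) runs between these two functions.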
However, since upgrading my database from version 5.9.0 to 5.10.0, I can no longer delete the file from the import folder, because the database still holds it open. The error persists in version 5.17 (last tested).
Expected Behavior (Mandatory)
After a file has been imported using apoc.load.csv, it should be closed once it has been fully consumed, so that other processes can access it.
Actual Behavior (Mandatory)
The issue arises when I attempt to delete the file post-import. I encounter a PermissionError, signaling that the file is still in use by another process. It seems the database is holding onto the file longer than anticipated, causing a conflict with my cleanup operation.
More specifically, I get this error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: '<ne04j>\\import\\import_file.csv
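As a possible stopgap, or as a way to check whether the handle is ever released, the delete could be retried a few times. This is only a sketch under the assumption that the lock is released after some delay, which this report does not confirm; `remove_with_retry` and its parameters are hypothetical names.

```python
import os
import time
from pathlib import Path


def remove_with_retry(path: Path, attempts: int = 5, delay_s: float = 1.0) -> bool:
    """Try to delete `path`, retrying while another process holds it open.

    Returns True once the file is deleted, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except PermissionError:
            # On Windows this is WinError 32 when the file is still in use.
            if attempt < attempts - 1:
                time.sleep(delay_s)
    return False
```

If this still returns False after a generous number of attempts, the handle is presumably held until the database is restarted, and retrying does not help.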
How to Reproduce the Problem
Simple Dataset (where it's possible)
The specific dataset does not matter; it happens with any dataset I try to import.
Python Code
This is the Python code I use to import and delete the file.
# Excerpt from a method of a class that holds the driver as self.driver
import os
from pathlib import Path

# Import file
def import_file(tx):
    result = tx.run('''CALL apoc.periodic.iterate('
        CALL apoc.load.csv("import_file.csv") yield map as row return row',
        'CREATE (record:Record)
        SET record += row'
        , {batchSize:10000, parallel:true, retries: 1});''')

with self.driver.session(database="neo4j") as session:
    session.execute_write(import_file)

# Delete the file from the import directory
path = Path(self.get_import_directory(), "import_file.csv")
os.remove(path)
Steps (Mandatory)
Import data using apoc.load.csv with the Python neo4j driver
Delete the file directly afterwards using Python
Specifications (Mandatory)
Currently used versions
Versions
OS: Windows 11 Enterprise
Neo4j: v5.10
Neo4j-Apoc (and extended): v5.10
Neo4j driver: v5.10
Python v3.11
The error also seems to occur without apoc.periodic.iterate, and even with no Python code involved: running apoc.load.csv directly in Neo4j Browser/Desktop and then trying to delete the file via File Explorer fails in the same way.
It is probably an error in Neo4j itself, as the code for apoc.load.csv has not changed.
I opened an issue on the neo4j kernel repository, to investigate both sides: neo4j/neo4j#13480