You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Every 2048th inserted date is incorrect. The incorrect date is seemingly random.
The problem does not seem to ocurr when values are inserted in large batches (5000+), but does in smaller batches (~2000)
The problem seems to only ocurr after running the program twice, it's not clear to me whether it only appears after the connection to the file has been closed, or whether it's just a question of the number of queries
The problem does not seem to appear when running a generic query that would show the entire table, but does when running a query for that specific line
To Reproduce
Run the following code to generate and add a series of consective dates associated with IDs to a database:
importpandasaspdimportdatetimeimportduckdbdbFile="testDB.db"defInitializeDB(db):
db.sql("CREATE SEQUENCE IF NOT EXISTS seq_dateID START 1")
db.sql("""CREATE TABLE IF NOT EXISTS Dates ( dateID LONG NOT NULL DEFAULT NEXTVAL('seq_dateID'), day DATE NOT NULL, PRIMARY KEY (dateID) )""")
defmain():
db=duckdb.connect(dbFile)
InitializeDB(db)
foriinrange(2):
datesToAdd=pd.date_range(start='2010-1-1', end=datetime.date.today(), freq='D').to_frame(name="date")
datesToAdd["date"] =pd.to_datetime(datesToAdd["date"]).dt.datedatesToAdd=datesToAdd.head(2000)
db.sql("INSERT INTO Dates (dateID, day) SELECT nextval('seq_dateID'), date FROM datesToAdd")
outputToCheck=db.sql("SELECT * FROM Dates").df()
print(outputToCheck.iloc[2047:2050])
outputToCheck2=db.sql("SELECT * FROM Dates WHERE dateID = 2049").df()
print(outputToCheck2)
passif__name__=="__main__":
main()
Upon running it the second time it should produce the following text:
dateID day
2047 2048 2010-02-17
2048 2049 2010-02-18
2049 2050 2010-02-19
dateID day
0 2049 2005-07-16
Which shows that the date at index 2048, with a dateID of 2049 is not consecutive, but the generic query above returns a different date that is consective
OS:
Windows 10 Home x64
DuckDB Version:
0.9.2
DuckDB Client:
Python
Full Name:
Orion Gesslein
Affiliation:
None/Myself
Have you tried this on the latest main branch?
I have tested with a main build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
Yes, I have
The text was updated successfully, but these errors were encountered:
What happens?
Every 2048th inserted date is incorrect. The incorrect date is seemingly random.
The problem does not seem to ocurr when values are inserted in large batches (5000+), but does in smaller batches (~2000)
The problem seems to only ocurr after running the program twice, it's not clear to me whether it only appears after the connection to the file has been closed, or whether it's just a question of the number of queries
The problem does not seem to appear when running a generic query that would show the entire table, but does when running a query for that specific line
To Reproduce
Run the following code to generate and add a series of consective dates associated with IDs to a database:
Upon running it the second time it should produce the following text:
Which shows that the date at index 2048, with a dateID of 2049 is not consecutive, but the generic query above returns a different date that is consective
OS:
Windows 10 Home x64
DuckDB Version:
0.9.2
DuckDB Client:
Python
Full Name:
Orion Gesslein
Affiliation:
None/Myself
Have you tried this on the latest
main
branch?I have tested with a main build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
The text was updated successfully, but these errors were encountered: