v0.0.4 #5

Merged
ahamptonTIA merged 5 commits into main from 20240320 on Mar 22, 2024

ahamptonTIA (Collaborator) commented Mar 22, 2024

Added the detect_file_encoding() function to address issue #4 (UnicodeDecodeError).
The function uses the chardet library to detect the character encoding of a text-based file, so that CSV files that do not explicitly declare their encoding can be read with an appropriate one. It analyzes a sample of the file's content to identify the most likely encoding.
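
For readers who have not opened the diff, here is a minimal sketch of what a sample-based detection helper could look like with chardet's `detect()` call. The `sample_size` and `default` parameters, the UTF-8 fallback, and the function body are assumptions made for illustration; this is not the code merged in this PR.

```python
from pathlib import Path

import chardet  # third-party: pip install chardet


def detect_file_encoding(file_path, sample_size=100_000, default="utf-8"):
    """Guess the encoding of a text file from a sample of its bytes.

    sample_size and default are hypothetical parameters for this sketch.
    """
    # Read raw bytes: the encoding is exactly what we do not know yet.
    with Path(file_path).open("rb") as fh:
        sample = fh.read(sample_size)

    # chardet.detect returns a dict with 'encoding' and 'confidence' keys;
    # 'encoding' is None when no guess could be made (e.g. an empty file).
    result = chardet.detect(sample)
    return result["encoding"] or default
```

The returned name can then be passed as the `encoding` argument when opening or parsing the file. Reading only a sample keeps detection cheap on large files, at the cost of possibly missing unusual bytes that appear later in the file.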

ahamptonTIA added 5 commits March 20, 2024 14:31
Adds a new function, detect_file_encoding, that automatically detects the character encoding of each input file so that files in different encodings are read correctly. The pandas DataFrames created from these files are then written out as UTF-8 for a consistent representation.
Removed encoding='utf-8' from pd.to_excel, which is no longer valid in pandas 1.1.0 and above.
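
The read-then-rewrite flow described in the commit notes above can be sketched roughly as follows. The helper name `csv_to_utf8`, its parameters, and the UTF-8 fallback are hypothetical and chosen only for this example; the actual code in the repository may be organized differently.

```python
import chardet  # third-party: pip install chardet
import pandas as pd


def csv_to_utf8(src_path, dst_path, sample_size=100_000):
    """Read a CSV in its detected encoding and rewrite it as UTF-8.

    Hypothetical helper for illustration; not a function from this PR.
    """
    # Detect the encoding from a sample of the raw bytes.
    with open(src_path, "rb") as fh:
        guess = chardet.detect(fh.read(sample_size))
    # Fall back to UTF-8 when chardet cannot make a guess.
    encoding = guess["encoding"] or "utf-8"

    # Parse with the detected encoding, then write a consistent UTF-8 copy.
    df = pd.read_csv(src_path, encoding=encoding)
    df.to_csv(dst_path, index=False, encoding="utf-8")
    return df
```

Calling `csv_to_utf8("input.csv", "output.csv")` (both paths are placeholders) would leave a UTF-8 copy of the input on disk and return the parsed DataFrame.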
ahamptonTIA merged commit bb2524a into main on Mar 22, 2024
ahamptonTIA deleted the 20240320 branch on March 22, 2024 15:07