v0.18.0 (2024-03-29)

Feature

methodology: Add detailed explanation for Aspect-Based Sentiment Analysis and Generative language models application (49f87bd)

Fix

dependencies: Update python and hyfi versions; update thematos version (671ae1b)

Documentation

implementation: Refine methodology detail (685f3e3)
book: Un-comment implementation chapter in table of contents (c070ae7)
book: Add implementation details to the book paper (e98572c)

v0.17.0 (2023-09-09)

Feature

pipeline: Add datasets save predictions configuration (d29c1d2)
config: Add datasets-save-predictions.yaml (bb93396)

v0.16.0 (2023-09-04)

Feature

corprep-runner: Add corprep-gpt4-train_year agent (2d7e1c5)

Fix

dependencies: Upgrade hyfi-absa to 0.4.0 (5dc8ca8)

v0.15.0 (2023-09-04)

Feature

corprep-gpt3.yaml: Add newsId to configuration (623012f)
corprep: Update text parsing instructions (42d7134)
workflow: Add corprep-runner.yaml config file (d9b7ff7)
config: Add input_filename attribute to corprep-gpt3-sample.yaml, use input_filename for file path (6dd5b65)

Fix

dependencies: Upgrade hyfi to 1.32.1, lexikanon to 0.6.4 and hyfi-absa to 0.3.3 (82f72bb)

v0.14.0 (2023-08-08)

Feature

runner: Add new corprep-gpt4-train_2019.yaml configuration file (444b8c9)
pipeline: Add dataframe_select_columns task to datasets.yaml (2f796d7)
corprep.yaml: Add datasets-filter-year to pipeline, add filter_year variable (dc215a7)
pipeline: Add datasets filter configuration (db5f422)

v0.13.0 (2023-08-07)

Feature

corprep: Add remove_columns function to dataset config (dd89120)
workflow: Add datasets-save pipeline (7a9f1d1)
pipeline: Add dataset_remove_columns to datasets.yaml (9cdb2c8)
pipeline: Add new pipeline for saving datasets (eff2232)

Fix

config: Correct dataset_path in corprep-gpt3-sample.yaml (837b9d9)
dependencies: Upgrade lexikanon to 0.5.2 (44fa3df)
dependencies: Upgrade hyfi to 1.20.1 (b95917d)

v0.12.2 (2023-08-04)

Fix

pipeline: Replace find_similar_docs_ac with find_similar_docs_by_clustering in datasets, change column names in datasets-similar.yaml (78ad328)
pipeline: Update tokenizer parameter name (f96b9b1)

v0.12.1 (2023-08-04)

Fix

tokenizer: Update stopwords path in kakao config (8158062)
pipeline: Increase sample size and adjust worker count, remove specific columns from removal list (28529da)
workflow: Enable noun and similar pipelines (f242a63)
pipeline: Update filter_dataset to filter_and_sample_data (1e97c46)
pipeline: Change num_samples to sample_size in datasets-test.yaml (c9ba5cb)
pipeline: Rename num_samples to sample_size in datasets-similar config (2cdc541)
pipeline: Change num_samples to sample_size in datasets-noun configuration (5c32059)
pipeline: Add sample filename in datasets filter config (b0609d0)

v0.12.0 (2023-08-03)

Feature

corprep: Add revision1 for more detailed sentiment task description (03edfb2)

v0.11.0 (2023-08-03)

Feature

config/runner: Add new configuration files for GPT3 and GPT4 (6d83abc)
pipeline: Add new dataset filter and load steps (441e063)
corprep: Add filter pipeline in config and comment out noun and similar pipelines (6bbd818)
filter: Add verbose print statements (08cf03e)
corprep: Add filter_dataset configuration files (6edb66f)
corprep/datasets: Add filter functionality to datasets (3a2835a)

v0.10.0 (2023-08-03)

Feature

config/runner: Add new gpt3 and gpt4 test configuration files (be6e168)
corprep-conf-runner: Add corprep-gpt3 and corprep-gpt4 configuration files (c82df5e)
corprep: Add gpt3 and gpt4 agent configurations (b2a52cd)
corprep: Add config for TRIPLE and QUAD tasks in prompts (cd8d7e4)

v0.9.3 (2023-08-03)

Fix

corprep: Add secrets directory initialization (0bce74f)
pipeline: Replace dataframe_save with save_dataframes, remove redundant dataframe_save and save_dataframes files (666226a)
dependencies: Upgrade hyfi to 1.13.0 (41eb5e8)

v0.9.2 (2023-07-30)

Fix

corprep: Add hyabsa to plugins (6dbea3f)
dependencies: Upgrade hyfi to 1.12.5 and add hyfi-absa 0.1.0 (b0c679c)

v0.9.1 (2023-07-28)

Fix

dependencies: Upgrade hyfi to 1.12.1 (21c83f5)
pipeline: Rename dataset_load_raw to load_raw_dataset, add file_pattern and set verbose to false (74cd0c2)
corprep: Add load_raw_dataset configuration (fa0b9be)
corprep: Remove specific tasks and columns in absa_agent_predict.yaml (2debf24)
absa_agent_predict: Add null value to pipe_obj_arg_name property (99a6473)
find_similar_docs: Rename to find_similar_docs_ac.yaml (fc62043)
pipeline: Rename dataset related functions for clarity (90ab34e)
pipeline: Reduce defaults in absa-kakao and gpt35 configurations (91b4e7c)
dependencies: Upgrade lexikanon to 0.3.2, thematos to 0.2.1 (85257e4)
corprep: Add 'cluster' column to DataFrame in similar_docs functions (1903b6f)

v0.9.0 (2023-07-27)

Feature

corprep: Add thematos plugin (45ae7f3)
pyproject: Add thematos dependency (6253d99)
book: Add data.md in supplementary with filtering details (a8ed8ca)
corprep: Add dataset_to_pandas and pandas_print_head configuration files (834f55b)
similarity.py: Add multiple data-processing and plotting functions, switch clustering method from DBSCAN to Agglomerative Clustering (4f38cba)
corprep: Add yaml configurations for saving dataframes (5e5b235)
corprep: Add find_similar_docs configuration (c468597)
config: Add new pipeline and task configuration for dataset simulation (3710f93)

v0.8.0 (2023-07-26)

Feature

corprep: Add run configurations for absa_agent_predict, filter_dataset and load_raw_dataset (7290fca)
workflow: Add datasets-test configuration file (b612cc4)
pipeline/config: Enhance datasets.yaml (28a6108)
tokenize: Add extract_tokens function to handle part-of-speech tagging (c6482fb)
corprep: Add dataset_extract_nouns configuration, add dataset_extract_tokens configuration (5ddf017)
tokenizer: Add kakao configuration (9ae964f)
pipeline: Add extract tokens step with kakao tokenizer config (04e66d2)

Fix

workflow: Add workflow_name field to workflows (f9dfb5b)
dependencies: Upgrade hyfi to 1.9.4 (98e0228)

v0.7.1 (2023-07-25)

Fix

corprep: Replace package name with path (adb8932)
dependencies: Upgrade hyfi to 1.9.3 and lexikanon to 0.2.3 (1dd2766)

v0.7.0 (2023-07-24)

Feature

tokenize: Add load_from_cache_file option (f4e0057)
pipeline: Add load from cache option in tokenizer config (9617708)
corprep: Add lexikanon plugin to HyFI initialization (2fe976a)

Fix

dependencies: Upgrade hyfi to 1.9.0 and ekonlpy to 2.0.1 (f53fa18)

v0.6.0 (2023-07-23)

Feature

tests: Add tokenizer test in corprep module (5487cf7)
corprep/datasets: Add similarity.py file with similarity analysis functions (708e79f)
tokenizer: Add strip_pos option to SimpleTokenizer configuration (7fc3991)
corprep: Add tokenizer_config_name and token_col to dataset tokenize configuration (4691e71)
config/task: Add new datasets-tokenize.yaml file (c3f2af5)
pipeline: Create datasets-tokenize.yaml for tokenization in pipeline (743b8d7)
tokenizer: Add flatten option to MecabTokenizer config (070740b)
tokenizer: Add new tokenizer configurations for SimpleTokenizer, MecabTokenizer, NLTKTokenizer, add new tagger configurations for mecab and nltk (26496c1)
tokenizer: Add new tokenizer classes and methods (df773b9)
tokenizer: Add hanja table loading function (b420649)
tests: Add stopwords test in tokenizer (30335b3)
tokenizer: Add stopwords functionality (073a176)
corprep: Add new stopwords configuration (bb79233)
pyproject.toml: Add nltk dependency (cf00d4c)
corprep: Add new configuration files for text normalization (ce82466)
normalizer: Add new configurations for text normalization (ac53a45)
corprep: Add new about information (bf79ee0)
corprep/resources/dictionaries/mecab: Add new ekon_v1.dic file (d408222)
tokenizer: Add utils for text normalization and string metrics (6b1d818)
tokenizer: Add hangle encoder with normalization and decomposition functions (afe155b)
corprep/tokenizer/hanja: Add new translation functions and character handling for Hangul and Hanja (490c763)
tokenizer: Add normalizer.py with normalizer functionality (8e579aa)
dependencies: Add scikit-learn version 1.3.0 (6c34d44)

Fix

pipeline: Correct typo in tokenize step (c9158d8)
NLTKTokenizer: Modify parse method return type (48ee9cd)
corprep: Streamline main function and import statements (11533a2)
corprep: Change how HyFI is initialized and used (c2e0344)

v0.5.0 (2023-07-20)

Feature

config: Add new configuration files for absa, pipeline, task, and workflow (0883120)

Documentation

Update URL from GitHub Pages to subdomain (f6b5f59)

v0.4.0 (2023-07-19)

Feature

corprep: Add new absa workflow configuration (c0fb068)
corprep: Add gpt35 pipeline to absa task configuration (d6c2fe8)
pipeline: Add new absa-kakao-gpt35.yaml configuration file (594c317)
corprep: Add new gpt35.yaml configuration file for ABSA task (ceb35ab)

Fix

absa/config: Handle additional exceptions in call_api function (7a53ed4)
corprep: Handle api responses and modify related functions (3400a26)
corprep/absa: Handle InvalidRequestError in call_api function (4d70fbf)
corprep/datasets: Add number of samples logging (872879c)
absa: Adjust agent call function and return structure (d10ad61)

v0.3.0 (2023-07-19)

Feature

corprep: Add absa-kakao pipeline configuration (bc1418d)
corprep: Add absa_agent_predict configuration file (62ef975)
absa/prompts: Add default.yaml configuration for TRIPLE and QUAD tasks (1c72bff)
corprep: Add new absa default configuration (68b6a58)
corprep/absa: Add config module with API logic and data models (623bc76)
corprep/absa: Add agent module with predict functionalities (1e9285b)
corprep/absa: Add new file __init__.py (32dc886)
corprep: Add dataset_load and absa configurations (f40c599)
corprep: Add absa task (653e525)
corprep/datasets/io.py: Add load_dataset function (3cb4294)
pipe: Add dataset_sample and a second dataset_save to steps (57cc25c)
corprep: Add sample_dataset function and related configuration (4b64a40)
pipeline: Add tokenize step to datasets pipeline (1536430)
corprep: Add new file for tokenizing dataset (6fb827e)
corprep/datasets/preprocessing: Add tokenize_dataset function (2db1824)

Fix

.tasks.toml: Lower coverage fail threshold to 1% (b2b7718)
corprep: Introduce setLogger for HyFI (c78e6e2)
datasets: Add path and file_pattern parameters to load_raw_dataset (2bf977b)
dependencies: Upgrade hyfi to 1.2.14 (ca9ac2b)

v0.2.0 (2023-07-17)

Feature

corprep: Add save_dataset pipe (b7a3dff)
corprep: Add save_raw_dataset configuration (7e331d2)
corprep: Add datasets.yaml configuration for pipeline (4cf7bd4)
corprep: Add new project configuration file (be8890d)
corprep: Add new configuration file (4cdf806)
datasets: Add function to save raw datasets (8a5319c)
datasets: Create corprep/datasets/__init__.py file (5808b35)

Fix

corprep: Add global_workspace_name to yaml config file (ffe0c12)
dependencies: Upgrade hyfi to 1.2.13 (d1b7658)
dependencies: Upgrade hyfi to 1.2.10 (68314de)
datasets: Add Dataset import to raw.py (c959747)
dependencies: Upgrade hyfi to 1.2.7 (82ab4e6)
dependencies: Upgrade hyfi to 1.2.6 (cfb2f3f)

Documentation

book: Add new sections (introduction, literature, methodology, results, conclusion, supplementary materials) (e1f0aaf)

v0.1.2 (2023-07-11)

Fix

dependencies: Upgrade hyfi to 1.2.2 (dfdc822)

v0.1.1 (2023-06-28)

Fix

dependencies: Upgrade hyfi to 0.15.0 (cc9463b)

v0.1.0 (2023-06-07)

Feature

Initial version (6b5cee9)

Fix

Initial version (1a33102)
