# Example usage

To use `addax` in a project:

In [78]:
import addax

print(addax.__version__)

0.1.0


___
### Manual Example
Here we demonstrate how to call the individual `addax` functions step by step to clean review data and then compute a sentiment score and label.

Import the library:

In [79]:
from addax.addax import (
    analyze_sentiment_dataframe,
    read_csv,
    standardize_headers,
    standardize_target_col_data,
)

Read the CSV:

In [80]:
df = read_csv("https://github.com/dtavizondykstra/addax/blob/bb57453fd311dc75fd1c64cceb6606cfab2c7c09/docs/example_data.csv?raw=true")
df.head()

2025-05-28 20:35:02,380 - addax.addax - INFO - Successfully read CSV: https://github.com/dtavizondykstra/addax/blob/bb57453fd311dc75fd1c64cceb6606cfab2c7c09/docs/example_data.csv?raw=true


Unnamed: 0,id,user,star rating,comment,reviewed on
0,0,,4,No issues.,7/23/14
1,1,0mie,5,"Purchased this for my device, it worked as adv...",10/25/13
2,2,1K3,4,it works as expected. I should have sprung for...,12/23/12
3,3,1m2,5,This think has worked out great.Had a diff. br...,11/21/13
4,4,2&1/2Men,5,"Bought it with Retail Packaging, arrived legit...",7/13/13


Set a constant for the target column:

In [81]:
# In a real script, define this constant at the top of the file, below the imports.
# It is placed here only to keep the example in a logical order.
TARGET_COLUMN = "comment"

Standardize the DataFrame headers (lowercase, replace spaces with underscores, remove special characters):

In [82]:
df = standardize_headers(df=df)
df.head()

2025-05-28 20:35:02,392 - addax.addax - INFO - Standardized headers to lowercase and underscores.


Unnamed: 0,id,user,star_rating,comment,reviewed_on
0,0,,4,No issues.,7/23/14
1,1,0mie,5,"Purchased this for my device, it worked as adv...",10/25/13
2,2,1K3,4,it works as expected. I should have sprung for...,12/23/12
3,3,1m2,5,This think has worked out great.Had a diff. br...,11/21/13
4,4,2&1/2Men,5,"Bought it with Retail Packaging, arrived legit...",7/13/13
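
To make the header transformation concrete, here is a minimal sketch of the three rules described above (lowercase, spaces to underscores, special characters removed). The helper `standardize_header` is hypothetical and written only for illustration; it is not the `addax` implementation, and the exact set of characters that `addax` strips is an assumption.

```python
import re


def standardize_header(name: str) -> str:
    """Hypothetical helper, not part of addax: normalize one column name."""
    name = name.strip().lower()
    # Runs of whitespace become a single underscore.
    name = re.sub(r"\s+", "_", name)
    # Assumption: anything other than a-z, 0-9, or underscore counts as "special".
    return re.sub(r"[^a-z0-9_]", "", name)


headers = ["id", "user", "star rating", "comment", "reviewed on"]
clean = [standardize_header(h) for h in headers]
print(clean)  # ['id', 'user', 'star_rating', 'comment', 'reviewed_on']
```

With a pandas DataFrame, the same idea could be applied by assigning a list built this way to `df.columns`.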


Standardize the text in the target column (lowercase, remove special characters, remove rows with missing or empty values):

In [83]:
df = standardize_target_col_data(df=df, target_column=TARGET_COLUMN)
df.head()

2025-05-28 20:35:02,400 - addax.addax - INFO - Formatting text data in target column 'comment'.
2025-05-28 20:35:02,404 - addax.addax - INFO - Removed 1 rows with missing or empty 'comment'.
2025-05-28 20:35:02,404 - addax.addax - INFO - Formatted text in column 'comment': lowercased and removed special characters.
2025-05-28 20:35:02,500 - addax.addax - INFO - Removed 0 rows with missing or empty 'comment'.
2025-05-28 20:35:02,500 - addax.addax - INFO - Formatted text in column 'comment': lowercased and removed special characters.


Unnamed: 0,id,user,star_rating,comment,reviewed_on
0,0,,4,no issues,7/23/14
1,1,0mie,5,purchased this for my device it worked as adve...,10/25/13
2,2,1K3,4,it works as expected i should have sprung for ...,12/23/12
3,3,1m2,5,this think has worked out greathad a diff bran...,11/21/13
4,4,2&1/2Men,5,bought it with retail packaging arrived legit ...,7/13/13
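
The cleaning step can be approximated with the standard library alone. The sketch below drops missing or empty comments and then lowercases the text and removes special characters. The helper `clean_text` is hypothetical, and the character class it uses (letters and whitespace only, so digits are dropped too) is an assumption inferred from the output shown in this document, not taken from the `addax` source.

```python
import re


def clean_text(text: str) -> str:
    """Hypothetical helper, not part of addax: lowercase and strip special characters."""
    # Assumption: only letters and whitespace are kept.
    return re.sub(r"[^a-z\s]", "", text.lower()).strip()


comments = ["No issues.", None, "   ", "This think has worked out great.Had a diff."]

# Drop missing or whitespace-only entries first, then clean what remains.
kept = [c for c in comments if c is not None and c.strip()]
cleaned = [clean_text(c) for c in kept]
print(cleaned)  # ['no issues', 'this think has worked out greathad a diff']
```

Note that removing punctuation without inserting a space joins adjacent words, which is why `great.Had` becomes `greathad` in the output above.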


Apply sentiment analysis to the target column:

In [84]:
df = analyze_sentiment_dataframe(df=df, target_column=TARGET_COLUMN, include_subjectivity=True, label=True)
df.head()

2025-05-28 20:35:04,501 - addax.addax - INFO - Analyzed sentiment for 4914 rows in 'comment'.


Unnamed: 0,id,user,star_rating,comment,reviewed_on,polarity,subjectivity,polarity_label,subjectivity_label
0,0,,4,no issues,7/23/14,0.0,0.0,neutral,objective
1,1,0mie,5,purchased this for my device it worked as adve...,10/25/13,0.2,0.2,positive,objective
2,2,1K3,4,it works as expected i should have sprung for ...,12/23/12,0.129167,0.525,positive,subjective
3,3,1m2,5,this think has worked out greathad a diff bran...,11/21/13,0.025,0.55,neutral,subjective
4,4,2&1/2Men,5,bought it with retail packaging arrived legit ...,7/13/13,0.386667,0.36,positive,objective
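
The label columns can be read as simple thresholds on the numeric scores. The sketch below shows one way such a mapping might work. The function names, the cutoff values (0.05 for polarity, 0.5 for subjectivity), and the `"negative"` label are all assumptions chosen to be consistent with the rows shown above; the thresholds that `addax` actually uses may differ.

```python
def label_polarity(score: float, threshold: float = 0.05) -> str:
    """Hypothetical mapping, not the addax implementation."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"  # assumed label; no negative row appears in the preview
    return "neutral"


def label_subjectivity(score: float, cutoff: float = 0.5) -> str:
    """Hypothetical mapping: scores at or above the cutoff count as subjective."""
    return "subjective" if score >= cutoff else "objective"


# (polarity, subjectivity) pairs copied from the preview rows above.
scores = [(0.0, 0.0), (0.2, 0.2), (0.129167, 0.525), (0.025, 0.55), (0.386667, 0.36)]
for polarity, subjectivity in scores:
    print(label_polarity(polarity), label_subjectivity(subjectivity))
```

Under these assumed cutoffs, the five printed pairs match the `polarity_label` and `subjectivity_label` values in the preview.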


___
### Pipeline Usage
Here we demonstrate how to use the `process_sentiment` pipeline function from `addax` to clean review data and compute a sentiment score and label in a single call.

Import the library:

In [85]:
from addax.addax import (
    process_sentiment,
    read_csv,
)

Import the data:

In [86]:
df = read_csv("https://github.com/dtavizondykstra/addax/blob/bb57453fd311dc75fd1c64cceb6606cfab2c7c09/docs/example_data.csv?raw=true")
df.head()

2025-05-28 20:35:05,234 - addax.addax - INFO - Successfully read CSV: https://github.com/dtavizondykstra/addax/blob/bb57453fd311dc75fd1c64cceb6606cfab2c7c09/docs/example_data.csv?raw=true


Unnamed: 0,id,user,star rating,comment,reviewed on
0,0,,4,No issues.,7/23/14
1,1,0mie,5,"Purchased this for my device, it worked as adv...",10/25/13
2,2,1K3,4,it works as expected. I should have sprung for...,12/23/12
3,3,1m2,5,This think has worked out great.Had a diff. br...,11/21/13
4,4,2&1/2Men,5,"Bought it with Retail Packaging, arrived legit...",7/13/13


Set a constant for the target column:

In [87]:
# In a real script, define this constant at the top of the file, below the imports.
# It is placed here only to keep the example in a logical order.
TARGET_COLUMN = "comment"

Run the sentiment analysis pipeline in one step:

In [88]:
process_sentiment(df=df, target_column=TARGET_COLUMN, include_subjectivity=True, label=True)

2025-05-28 20:35:05,248 - addax.addax - INFO - Standardized headers to lowercase and underscores.
2025-05-28 20:35:05,249 - addax.addax - INFO - Processing text column 'comment' for sentiment analysis.
2025-05-28 20:35:05,249 - addax.addax - INFO - Formatting text data in target column 'comment'.
2025-05-28 20:35:05,252 - addax.addax - INFO - Removed 1 rows with missing or empty 'comment'.
2025-05-28 20:35:05,253 - addax.addax - INFO - Formatted text in column 'comment': lowercased and removed special characters.
2025-05-28 20:35:05,340 - addax.addax - INFO - Removed 0 rows with missing or empty 'comment'.
2025-05-28 20:35:05,340 - addax.addax - INFO - Formatted text in column 'comment': lowercased and removed special characters.
2025-05-28 20:35:07,350 - addax.addax - INFO - Analyzed sentiment for 4914 rows in 'comment'.


Unnamed: 0,id,user,star_rating,comment,reviewed_on,polarity,subjectivity,polarity_label,subjectivity_label
0,0,,4,no issues,7/23/14,0.000000,0.000000,neutral,objective
1,1,0mie,5,purchased this for my device it worked as adve...,10/25/13,0.200000,0.200000,positive,objective
2,2,1K3,4,it works as expected i should have sprung for ...,12/23/12,0.129167,0.525000,positive,subjective
3,3,1m2,5,this think has worked out greathad a diff bran...,11/21/13,0.025000,0.550000,neutral,subjective
4,4,2&1/2Men,5,bought it with retail packaging arrived legit ...,7/13/13,0.386667,0.360000,positive,objective
...,...,...,...,...,...,...,...,...,...
4910,4910,"ZM ""J""",1,i bought this sandisk gb class to use with my...,7/23/13,0.200000,0.271667,positive,objective
4911,4911,Zo,5,used this for extending the capabilities of my...,8/22/13,0.800000,0.750000,positive,subjective
4912,4912,Z S Liske,5,great card that is very fast and reliable it c...,3/31/14,0.280833,0.609167,positive,subjective
4913,4913,Z Taylor,5,good amount of space for the stuff i want to d...,9/16/13,0.600000,0.550000,positive,subjective
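
The log lines above suggest that `process_sentiment` runs header standardization, text cleaning, and sentiment analysis in that order. The sketch below illustrates that kind of composition in plain Python. `run_pipeline` and the two stand-in steps are hypothetical and operate on a list of row dictionaries so the example runs without `addax` or pandas installed; they are not the library's implementation.

```python
from functools import reduce


def run_pipeline(df, steps):
    """Pass the data through each step in order; each step takes and returns the data."""
    return reduce(lambda data, step: step(df=data), steps, df)


# Stand-in steps (NOT the addax functions), each accepting a `df` keyword argument.
def drop_empty_comments(df):
    return [row for row in df if row["comment"] and row["comment"].strip()]


def lowercase_comments(df):
    return [{**row, "comment": row["comment"].lower()} for row in df]


rows = [{"comment": "No issues."}, {"comment": None}, {"comment": "Great card"}]
result = run_pipeline(rows, [drop_empty_comments, lowercase_comments])
print(result)  # [{'comment': 'no issues.'}, {'comment': 'great card'}]
```

Keeping each step as a function that takes and returns the data is what makes the manual example and the one-call pipeline interchangeable.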
