# Semantic Content Classifier (SCC)
This Jupyter notebook demonstrates practical applications of the SCA_utils.py library for Semantic Content Analysis (SCA): retrieving word vectors, finding similar words, and scoring texts for objectivity and subjectivity.



## Introduction

### Libraries

In [1]:
import sys
sys.path.append("../source")  # Add the directory 'source' to sys.path
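
The relative path above is resolved against the current working directory, so the import only works when the notebook is launched from its own folder. A minimal sketch of a slightly more defensive variant, assuming the same `../source` layout (the `source_dir` name is illustrative and not part of the library):

```python
import sys
from pathlib import Path

# Assumption: the notebook's working directory sits next to 'source',
# exactly as the relative path "../source" implies.
source_dir = (Path.cwd().parent / "source").resolve()

# Avoid adding the same entry again when the cell is re-run.
if str(source_dir) not in sys.path:
    sys.path.append(str(source_dir))
```

Using an absolute path makes it easier to see, by printing `sys.path`, which directory is actually being searched.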

In [2]:
from sca_utils import TextClassifier




## Using SCA_utils methods:

In [3]:
## Instantiating the SCA Text Classifier:
classifier = TextClassifier(model_multiclass_path='../models/model_02_E.h5',
                            encoder_multiclass_path='../models/encoder_oneHot_E.pickle',
                            model_regression_path='../models/model_01_D2.h5')

### Getting the embedded vector for a given word:

In [4]:
## Experimenting with an existing word (here the string 'None', not the None object)
word_text, word_vector = classifier.nlp_getVector(word='None')
print(word_text)
print(f'{word_vector[:2]}..')


None
[-0.0562946  -0.00365345]..


In [5]:
## Passing None instead of a string to check how errors are reported:
classifier.nlp_getVector(word=None)

Error: word vector not available for None due to:
[E1041] Expected a string, Doc, or bytes as input, but got: <class 'NoneType'>
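
The message above suggests that `nlp_getVector` reports the underlying error (the `[E1041]` code appears to come from spaCy) instead of letting it propagate. When processing many inputs, it may be simpler to filter out non-string values before calling the classifier. A minimal sketch, where `clean_words` is a hypothetical helper and not part of SCA_utils:

```python
def clean_words(candidates):
    """Keep only non-empty strings, stripped of surrounding whitespace."""
    cleaned = []
    for item in candidates:
        # None, numbers and other non-string values are skipped.
        if isinstance(item, str) and item.strip():
            cleaned.append(item.strip())
    return cleaned


words = clean_words(["technology", None, "  map ", "", 42])
print(words)  # ['technology', 'map']

# Each remaining word could then be passed on, for example:
# for w in words:
#     classifier.nlp_getVector(word=w)
```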


### Finding similar words:

In [6]:
## Replace keyword for the word of interest:
keyword = "technology"

similar_words = classifier.similarity_findWords(keyword, n=5)

print(f'--- Similar words for: "{keyword}":')
for word, similarity in similar_words:
    print(f'"{word}", similarity of {similarity:.2f}')

--- Similar words for: "technology":
"technologic", similarity of 0.95
"technologies", similarity of 0.94
"technologie", similarity of 0.93
"technological", similarity of 0.90
"technoscience", similarity of 0.89

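The notebook does not state which metric `similarity_findWords` uses. Scores close to 1 for near-identical word forms are consistent with cosine similarity between word vectors, so the following is a minimal sketch under that assumption. The three-dimensional vectors and the `rank_similar` helper are made up for illustration; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)


def rank_similar(query_vector, vocabulary, n=5):
    """Return the n (word, score) pairs with the highest cosine similarity."""
    scored = [(word, cosine_similarity(query_vector, vec))
              for word, vec in vocabulary.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy vectors, invented for this example only.
toy_vocabulary = {
    "technologies": [0.8, 0.2, 0.1],
    "science": [0.5, 0.5, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
query = [0.9, 0.1, 0.0]  # stands in for the vector of "technology"

for word, score in rank_similar(query, toy_vocabulary, n=2):
    print(f'"{word}", similarity of {score:.2f}')
```

Cosine similarity ranges from -1 to 1 and depends only on the direction of the vectors, not their length.
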

### Evaluating objectivity and subjectivity in long texts:

In [13]:
input_objective = '''
Geodimensioning, a critical component within the geospatial field, entails the measurement and quantification of Earth's dimensions and geographical features. This intricate process involves various metrics, each serving a unique purpose in enhancing the accuracy and reliability of geospatial data. Scale, a fundamental metric, denotes the ratio of distance on a map to the corresponding distance on the ground. It's pivotal in ensuring that geospatial representations are proportionally accurate to real-world dimensions. Scale accuracy is paramount, as even minor discrepancies can lead to significant errors in distance and area measurements, affecting planning and decision-making processes.
Precision refers to the level of detail and exactness in the measurement of geospatial data. High precision is essential in applications requiring fine-grained data, such as urban planning and cadastral mapping. Precision is often limited by the measurement tools and technology employed, necessitating ongoing advancements in geospatial instrumentation and methodologies.
Resolution pertains to the smallest detectable feature within geospatial data. In digital mapping and satellite imagery, resolution is a key determinant of the quality and usability of geospatial information. High-resolution data captures minute details, crucial for applications like environmental monitoring and infrastructure development.
Accuracy, distinct from precision, measures the closeness of a given measurement to its true value. In geodimensioning, accuracy is affected by factors such as instrument calibration, measurement techniques, and environmental conditions. Ensuring high accuracy is imperative for reliable geospatial analysis and modeling.
Georeferencing accuracy is another vital metric, ensuring that geospatial data aligns correctly with real-world coordinates. This is critical for integrating and comparing data from diverse sources, enabling accurate mapping and analysis across multiple scales and geographies.
Datum and projection consistency are essential for maintaining the integrity of geospatial data. Datum refers to the reference frame used for measuring locations on Earth's surface, while projection involves the method of translating three-dimensional Earth onto a two-dimensional map. Consistency in these metrics ensures that geospatial data from different sources can be accurately integrated and compared.
In conclusion, the metrics of geodimensioning play a crucial role in the fidelity and utility of geospatial data. As technology advances, continuous refinement of these metrics is essential to meet the growing demands of diverse applications in fields such as environmental science, urban planning, and global navigation systems.
'''

In [15]:
input_subjective = '''
In the realm of Earth's vast embrace, where landscapes whisper ancient tales, geodimensioning becomes an art, a poetic dance with space and time. It's here, amidst the cartographer's dream, that maps unfurl like canvases, painting the world in strokes of truth and mystery combined.

Scale, the cartographer's muse, weaves tales of distances vast and small, where a single line can span mountains, and a dot encloses cities' sprawl. It's a delicate balance, a harmony sought, between the expanse of the world and the parchment's thought.

Precision, the sculptor of detail, carves the world with a fine-edged tool, where every measurement is a verse, and every data point a jewel. It's the pursuit of perfection, a never-ending quest, to capture the Earth's essence, at its very best.

Resolution, the seer's gift, reveals the world in grains of sand, where hidden stories come to light, and nothing is too small to stand. It's the clarity of vision, the ability to see, the intricate patterns of nature, and the world's underlying symmetry.

Accuracy, the anchor of truth, holds firm in the shifting tides, where measurements speak of reality, and the truth no longer hides. It's the cornerstone of trust, in the maps we chart, guiding explorers and dreamers, as they embark.

Georeferencing, the dancer's grace, aligns the stars with the earthly plane, where each point is a heartbeat, in the universe's refrain. It's the cosmic connection, the thread that binds, the world's vast wonders, in the maps we find.

In the whisper of leaves, and the rush of streams, in the shadows of mountains, and the sunlight's beams, geodimensioning captures the world, in lines and numbers, yet tells a story, of a universe unfurled.

So let us wander, with maps in hand, through this poetic landscape, vast and grand, where every measurement, and every line, is a verse in the Earth's grand design.
'''

In [16]:
## Regression inference for a long text:
classifier.textClassifier_regression(input_subjective)

--- 
In the realm of Earth's vast embrace, where landscapes whisper ancient tales, geodimensioning becomes an art, a poetic dance with space and time. It's here, amidst the cartographer's dream, that maps unfurl like canvases, painting the world in strokes of truth and mystery combined.

Scale, the cartographer's muse, weaves tales of distances vast and small, where a single line can span mountains, and a dot encloses cities' sprawl. It's a delicate balance, a harmony sought, between the expanse of the world and the parchment's thought.

Precision, the sculptor of detail, carves the world with a fine-edged tool, where every measurement is a verse, and every data point a jewel. It's the pursuit of perfection, a never-ending quest, to capture the Earth's essence, at its very best.

Resolution, the seer's gift, reveals the world in grains of sand, where hidden stories come to light, and nothing is too small to stand. It's the clarity of vision, the ability to see, the intricate patterns o

In [24]:
## Multiclass classification for a given word:
classifier.textClassifier_multiclass('happyness')

--- SCA: "happyness" has Latent content.
Model used: Model_02_E_regularized
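
The notebook does not show how the class name is derived. The model and encoder file names suggest a classifier whose output probabilities are mapped back to labels through a one-hot encoder. A minimal sketch of that decoding step, under that assumption; the label list is hypothetical (only "Latent" appears in the output above) and `decode_prediction` is not part of SCA_utils:

```python
def decode_prediction(probabilities, labels):
    """Map a vector of class probabilities to its most likely label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


# Hypothetical class names, in the order a one-hot encoder might store them.
HYPOTHETICAL_LABELS = ["Latent", "Manifest"]

label, confidence = decode_prediction([0.81, 0.19], HYPOTHETICAL_LABELS)
print(f'"{label}" with probability {confidence:.2f}')
```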


In [34]:
## Regression inference for a given sentence:
classifier.textClassifier_regression('Your sample sentence')

--- Your sample sentence:
38.72 of objectivity
15.49 of subjectivity


In [18]:
classifier.textClassifier_regression('I am working')

--- I am working:
83.07 of objectivity
56.62 of subjectivity
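
In both examples the two scores do not add up to 100 (here 83.07 + 56.62), so they appear to be independent estimates rather than complementary percentages. The notebook does not show whether `textClassifier_regression` also returns these values, so the sketch below simply parses the printed format shown above; `parse_scores` is a hypothetical helper, not part of SCA_utils.

```python
import re

# Matches lines such as "83.07 of objectivity".
SCORE_LINE = re.compile(
    r"^\s*(\d+(?:\.\d+)?) of (objectivity|subjectivity)\s*$",
    re.MULTILINE,
)


def parse_scores(printed_output):
    """Extract the objectivity and subjectivity scores from printed output."""
    return {name: float(value)
            for value, name in SCORE_LINE.findall(printed_output)}


sample = """--- I am working:
83.07 of objectivity
56.62 of subjectivity
"""

scores = parse_scores(sample)
print(scores)  # {'objectivity': 83.07, 'subjectivity': 56.62}
```

With the scores in a dictionary, texts can be compared or sorted without reading the printed output by eye.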
