I am currently using the Universal Sentence Encoder Lite model to get text embeddings for our corpus in order to find similar items. The issue I am facing is that if I take a very long paragraph and remove a sentence from its end, the two embeddings (before and after removing the sentence) are exactly the same. Here is what I have tried:
sent1 = """No one is ever ready for an emergency but you can be prepared When you know where to get information have the right supplies and have a plan for you your loved ones and your pets you can protect yourself and your family before a crisis and for at least 72hours afterwards The Homeland Security and Emergency Management Agency HSEMA iPhone iPad application contains important information you can use before during and after an emergency or disaster such as Emergency evacuation routes that lead you out of the District Alert DC emergency text alerts Current weather outlooks from the National Weather Service Disaster safety tips Help lines that provide telephone numbers to essential emergency resources and information A calendars informing the public about emergency preparedness training HSEMA Community Outreach events as well as special events such as marathons and street festivals A direct link to the local transit authority s METRO main website and twitter page List of shelters that are opened after a disaster occurs A direct link to FEMA s website Maps of where District Police and Fire stations are located Regional preparedness links Steps to take to make a family emergency plan a go kit and much more The tools in this app help ensure that no matter where you are or what you are doing you ll be prepared The app is free to download through your iPhone and iPad provider s app store"""
sent2 = """No one is ever ready for an emergency but you can be prepared When you know where to get information have the right supplies and have a plan for you your loved ones and your pets you can protect yourself and your family before a crisis and for at least 72hours afterwards The Homeland Security and Emergency Management Agency HSEMA iPhone iPad application contains important information you can use before during and after an emergency or disaster such as Emergency evacuation routes that lead you out of the District Alert DC emergency text alerts Current weather outlooks from the National Weather Service Disaster safety tips Help lines that provide telephone numbers to essential emergency resources and information A calendars informing the public about emergency preparedness training HSEMA Community Outreach events as well as special events such as marathons and street festivals"""
messages = [sent1, sent2]
# Get the embeddings for sent1 & sent2
values, indices, dense_shape = process_to_IDs_in_sparse_format(sp, messages)
with tf.Session() as session:
    session.run([tf.global_variables_initializer(), tf.tables_initializer()])
    message_embeddings = session.run(
        encodings,
        feed_dict={input_placeholder.values: values,
                   input_placeholder.indices: indices,
                   input_placeholder.dense_shape: dense_shape})
# Find cosine similarity between sent1 & sent2
from scipy import spatial
1 - spatial.distance.cosine(message_embeddings[0], message_embeddings[1])
>>> 1.0
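For reference, the names input_placeholder, encodings, sp and process_to_IDs_in_sparse_format above come from the standard USE Lite example on TF Hub; my setup is roughly the following (TF 1.x graph mode):

import tensorflow as tf            # TF 1.x
import tensorflow_hub as hub
import sentencepiece as spm

# Load the Lite module and build the encoding graph over sparse SentencePiece IDs
module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-lite/2")
input_placeholder = tf.sparse_placeholder(tf.int64, shape=[None, None])
encodings = module(inputs=dict(
    values=input_placeholder.values,
    indices=input_placeholder.indices,
    dense_shape=input_placeholder.dense_shape))

# Load the SentencePiece model that ships with the module
with tf.Session() as sess:
    spm_path = sess.run(module(signature="spm_path"))
sp = spm.SentencePieceProcessor()
sp.Load(spm_path)

def process_to_IDs_in_sparse_format(sp, sentences):
    # Convert each sentence to SentencePiece IDs in the sparse format the module expects
    ids = [sp.EncodeAsIds(x) for x in sentences]
    max_len = max(len(x) for x in ids)
    dense_shape = (len(ids), max_len)
    values = [item for sublist in ids for item in sublist]
    indices = [[row, col] for row in range(len(ids)) for col in range(len(ids[row]))]
    return values, indices, dense_shape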
Is there a word-length limit for the Universal Sentence Encoder Lite model? I also tried Universal Sentence Encoder models 4 and 5, and I did not see this issue with those models.
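One thing I can check on my side is how many SentencePiece tokens each paragraph produces, in case the Lite model silently truncates long inputs (I have not confirmed that it does):

# Token counts for the two paragraphs, using the same sp processor as above
print(len(sp.EncodeAsIds(sent1)), len(sp.EncodeAsIds(sent2)))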
Also, when will a new version of the Lite model that works with TensorFlow 2.0 be released?
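For context, with models 4 and 5 (where I do not see this behavior) I load them with the TF 2.x API directly, roughly like this:

import tensorflow_hub as hub
from scipy import spatial

# USE v4 is a TF 2.x SavedModel that takes raw strings directly
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
embeddings = embed([sent1, sent2]).numpy()
print(1 - spatial.distance.cosine(embeddings[0], embeddings[1]))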