jessica1438/Information-Retreival-using-Speech-Data-

Information-Retreival-using-Speech-Data-

NLP and Information Extraction Expertise:

Led an NLP project that extracts information from spoken content in YouTube videos. Collected the data by transcribing the audio with speech-to-text algorithms and APIs, then preprocessed the transcripts. Combined three information extraction techniques: entity recognition, rule-based extraction, and dependency parsing. Used both spaCy and Hugging Face Transformers for entity recognition.
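The preprocessing step is not described in detail, so the following is only a minimal sketch of what cleaning a raw transcript could look like. The filler-word list, the caption-cue pattern, the timestamp pattern, and the function name `clean_transcript` are all assumptions made for illustration, not the project's actual code.

```python
import re

# Hypothetical filler list; tune it for the actual transcripts.
FILLERS = {"um", "uh", "erm", "hmm"}

# Bracketed caption cues such as [Music] or (applause).
CUE_RE = re.compile(r"\[[^\]]*\]|\([^)]*\)")
# Timestamps such as 0:15 or 01:02:33.
TIMESTAMP_RE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")


def clean_transcript(raw: str) -> str:
    """Strip caption cues, timestamps, and filler words; collapse whitespace."""
    text = CUE_RE.sub(" ", raw)
    text = TIMESTAMP_RE.sub(" ", text)
    # Compare each word without trailing punctuation so "um," is also dropped.
    kept = [w for w in text.split() if w.lower().strip(",.!?") not in FILLERS]
    return " ".join(kept)
```

For example, `clean_transcript("[Music] 0:15 um, the brakes feel uh great")` keeps only the spoken content. The cleaned text can then be passed to an entity recognizer such as a spaCy pipeline or a Hugging Face Transformers model.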

Dependency Parsing and Linguistic Analysis:

Parsed sentence dependencies with spaCy to expose the syntactic relationships between words. Visualized the dependency parse trees to make the key syntactic relationships in the spoken discourse easier to identify.
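As a rough illustration of the dependency parsing step, the sketch below turns a parsed sentence into (head, relation, dependent) triples. It relies only on the spaCy token attributes `text`, `dep_`, and `head`. The helper names and the model name `en_core_web_sm` are assumptions, since the README does not say which pipeline was used.

```python
from typing import Iterable, List, Tuple


def dependency_triples(tokens: Iterable) -> List[Tuple[str, str, str]]:
    """Return (head, relation, dependent) for every non-root token.

    Accepts a spaCy Doc or any iterable of objects exposing
    .text, .dep_ and .head.
    """
    triples = []
    for token in tokens:
        # spaCy's English pipelines label the sentence root "ROOT".
        if token.dep_ == "ROOT":
            continue
        triples.append((token.head.text, token.dep_, token.text))
    return triples


def parse_with_spacy(text: str, model: str = "en_core_web_sm"):
    """Parse text with spaCy and return its triples (model name is assumed)."""
    import spacy  # imported lazily so the helper above has no hard dependency

    nlp = spacy.load(model)
    return dependency_triples(nlp(text))
```

With spaCy and a model installed, `parse_with_spacy("The brakes squeal")` returns the triples for that sentence, and `spacy.displacy.render(doc, style="dep")` draws the corresponding parse tree for a parsed `doc`.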

Knowledge Graph Construction and Database Integration:

Designed and implemented a directed knowledge graph of the entities, their labels, and the relationships extracted from the transcribed speech. Wrote a Python script that persists the graph to a Neo4j graph database for further analysis and querying. The visualizations and database integration make the extracted content easier to interpret.
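One way such a persistence script might look is sketched below, using the official `neo4j` Python driver. The `Edge` class, the `Entity` node label, the generic `RELATION` relationship type, and the connection arguments are all hypothetical choices for this example. The project's actual schema is not documented here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Edge:
    """One directed relationship: source -[relation]-> target."""
    source: str
    relation: str
    target: str


# Cypher cannot take a relationship type as a parameter, so this sketch
# stores the relation name as a property on a generic RELATION edge.
MERGE_EDGE = (
    "MERGE (a:Entity {name: $source}) "
    "MERGE (b:Entity {name: $target}) "
    "MERGE (a)-[:RELATION {type: $relation}]->(b) "
    "SET a.label = $source_label, b.label = $target_label"
)


def edge_params(edge: Edge, labels: Dict[str, str]) -> Dict[str, str]:
    """Build the query parameters for one edge; unlabeled entities get UNKNOWN."""
    return {
        "source": edge.source,
        "source_label": labels.get(edge.source, "UNKNOWN"),
        "target": edge.target,
        "target_label": labels.get(edge.target, "UNKNOWN"),
        "relation": edge.relation,
    }


def persist(edges: Iterable[Edge], labels: Dict[str, str],
            uri: str, user: str, password: str) -> None:
    """Write every edge to Neo4j; MERGE makes re-runs idempotent."""
    from neo4j import GraphDatabase  # imported lazily; needs the neo4j package

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            for edge in edges:
                session.run(MERGE_EDGE, edge_params(edge, labels))
    finally:
        driver.close()
```

Using `MERGE` rather than `CREATE` means the script can be re-run on the same transcript without duplicating nodes or relationships.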

Comprehensive Information Extraction:

Used rule-based entity recognition to capture quantity-related information and specific events, giving finer-grained extraction. Extracted details about brakes, vehicle models, manufacturers, driving modes, noise levels, and design opinions through pattern matching and contextual cues. Produced a structured record of the extracted information for each entry in the transcribed speech.
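The rule-based extraction described above can be approximated with plain regular expressions, as in the sketch below. The unit list, the keyword lists, and the category names are invented for illustration, and plain `re` patterns stand in here for whatever matcher the project actually uses.

```python
import re
from typing import Dict, List

# Number followed by a unit, e.g. "60 mph" or "4.2 seconds" (hypothetical units).
QUANTITY_RE = re.compile(
    r"\b(\d+(?:\.\d+)?)\s*(mph|km/h|kwh|hp|decibels|db|miles|seconds)\b",
    re.IGNORECASE,
)

# Hypothetical keyword lists per category; extend them for real transcripts.
TERMS: Dict[str, List[str]] = {
    "brakes": ["brakes", "brake", "braking"],
    "driving_mode": ["sport mode", "eco mode", "comfort mode"],
    "noise": ["noise", "quiet", "loud"],
}
TERM_PATTERNS = {
    category: re.compile(r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b")
    for category, terms in TERMS.items()
}


def extract(sentence: str) -> Dict[str, List[str]]:
    """Return the quantities and category keywords found in one sentence."""
    found: Dict[str, List[str]] = {}
    quantities = [
        f"{value} {unit.lower()}" for value, unit in QUANTITY_RE.findall(sentence)
    ]
    if quantities:
        found["quantity"] = quantities
    lowered = sentence.lower()
    for category, pattern in TERM_PATTERNS.items():
        hits = pattern.findall(lowered)
        if hits:
            found[category] = hits
    return found
```

Running `extract` over every sentence of a transcript and collecting the non-empty results gives one structured record per entry.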
