Tools to construct and process webgraphs from Common Crawl data
Updated Oct 2, 2024 · Java
A sample application that demonstrates how to build a graph processing platform to analyze sources of emotional influence on Twitter.
Projects from a Cloud Computing course.
Search Engine projects
A distributed algorithm applied to the Bitcoin blockchain that creates a new representation of transactions: a clustered graph that merges all addresses belonging to the same owner or organization.
Search Engine for Books (Java, Apache Lucene, crawler4j, Apache Spark)
Link ranking with Apache Giraph for Apache Nutch
Coursework for CS550 : Massive Data Mining. Topics covered include Map-Reduce, Association Rules, Frequent Itemsets, Locality-Sensitive Hashing (LSH), Singular Value Decomposition (SVD), Page Rank, k-means, Modularity, Spectral Clustering, Clique-based communities, Clustering Data Streams.
Command line tool to compute PageRank scores over RDF graphs
Applies Elasticsearch and Google's PageRank algorithm to search UML models
A service that summarises entities in RDF graphs
A PageRank implementation on Hadoop
Uses the PageRank and InversePageRank algorithms to compute both scores for each website and sort the sites in descending order, then counts the normal and spam websites among the top N and visualises the result.
Uses MapReduce to calculate PageRank over Wikipedia, handling dead ends and spider traps
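Several of the projects above implement PageRank with dead-end and spider-trap handling. As a minimal single-machine sketch (not taken from any of the listed repositories), the standard remedies are teleportation with a damping factor for spider traps, and uniform redistribution of the rank mass held by dead-end nodes; the class and method names below are illustrative:

```java
import java.util.Arrays;

public class PageRank {

    // Power-iteration PageRank with damping factor d.
    // Teleportation (the (1 - d) term) escapes spider traps;
    // rank mass sitting on dead ends (nodes with no out-links)
    // is redistributed uniformly so total rank stays 1.
    static double[] pageRank(int[][] outLinks, double d, int iterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);          // start from the uniform distribution

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double deadEndMass = 0.0;

            for (int u = 0; u < n; u++) {
                if (outLinks[u].length == 0) {
                    deadEndMass += rank[u];   // dead end: hold its mass aside
                } else {
                    double share = rank[u] / outLinks[u].length;
                    for (int v : outLinks[u]) {
                        next[v] += share;     // distribute along out-links
                    }
                }
            }
            for (int v = 0; v < n; v++) {
                next[v] = (1 - d) / n                       // teleportation
                        + d * (next[v] + deadEndMass / n);  // redistributed dead-end mass
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Tiny chain 0 -> 1 -> 2, where node 2 is a dead end.
        int[][] graph = { {1}, {2}, {} };
        double[] r = pageRank(graph, 0.85, 50);
        System.out.println(Arrays.toString(r));
    }
}
```

Because dead-end mass is put back into circulation each iteration, the rank vector remains a probability distribution (it sums to 1), which is the property the MapReduce variants preserve across map and reduce rounds.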
A page rank implementation on top of Neo4j