
Monosemantic Search

We take the visualization interface from Anthropic's Towards Monosemanticity: Decomposing Language Models With Dictionary Learning and make it 80x+ faster.

Check it out here!


How does this work?

We first scrape all of the data from Anthropic's visualization. Then we index every token we want to search in Redis, which allows for extremely fast retrieval.
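The idea behind the token index can be sketched in plain Python. In this sketch a dict stands in for Redis, and the key scheme (`token:<token>`) and record fields are illustrative assumptions, not the repository's actual layout:

```python
# Sketch of the token index. A plain dict stands in for Redis here;
# the "token:<token>" key scheme and the record fields are assumptions
# for illustration, not the repository's actual layout.
import json

def build_index(activations):
    """Map each token to a JSON-encoded list of (feature, strength) hits."""
    index = {}
    for feature_id, token, strength in activations:
        key = f"token:{token}"
        index.setdefault(key, []).append({"feature": feature_id, "strength": strength})
    # Redis stores strings, so serialize each posting list once at index time.
    return {key: json.dumps(hits) for key, hits in index.items()}

def lookup(index, token):
    """A search is then a single O(1) key fetch plus one JSON decode."""
    return json.loads(index.get(f"token:{token}", "[]"))
```

With this layout, serving a query never touches the raw scraped files; the expensive work all happens once, at indexing time.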

We then built a backend in Python with Flask that retrieves data very quickly using the Redis indexes.
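A minimal version of such an endpoint might look like the following. The route name, query parameter, and in-memory dict standing in for Redis are all assumptions; the repository's actual app.py may be organized differently:

```python
# Minimal sketch of a Flask search endpoint backed by the token index.
# The /search route, the "q" parameter, and the dict standing in for a
# redis-py client are assumptions; the real app.py may differ.
from flask import Flask, jsonify, request

app = Flask(__name__)

# In production this would be a Redis client; a dict keeps the sketch runnable.
TOKEN_INDEX = {
    "token:cat": [{"feature": 1, "strength": 0.9}],
}

@app.route("/search")
def search():
    token = request.args.get("q", "")
    return jsonify(TOKEN_INDEX.get(f"token:{token}", []))
```

Because each query resolves to a single key fetch, the endpoint's latency is dominated by network round-trips rather than any scan over the data.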

Finally, all of this is presented with our Next.js frontend.

The biggest optimization here is the Redis indexing, which makes our search many times faster than the current search on Anthropic's visualization page.
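Why an index wins over a linear scan can be illustrated with sorted tokens and binary search; this is a generic sketch of the principle, not the repository's actual query path:

```python
# Why an index beats a linear scan: with tokens kept sorted, a prefix
# query is two binary searches (O(log n)) instead of a full O(n) pass.
# This illustrates the principle only; it is not the repo's query path.
import bisect

def prefix_search(sorted_tokens, prefix):
    """Return every token starting with `prefix`, via two bisects."""
    lo = bisect.bisect_left(sorted_tokens, prefix)
    # "\uffff" sorts after any character that can follow the prefix.
    hi = bisect.bisect_left(sorted_tokens, prefix + "\uffff")
    return sorted_tokens[lo:hi]
```

Redis applies the same idea with its own data structures, so lookup cost stays flat even as the token set grows.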

The only downside is that holding all of the data in memory is expensive: the Redis DB is about 3.6 GB in size.

Dev & Production

First, get all the data. This is done from the scraper folder. The data has already been scraped, so you can use it directly as well.

To scrape all of the data, run:

```
python3 main.py
```

Then we must index all of the data in Redis. Make sure you have a Redis instance that can handle 4 GB of data.

From the server folder:

```
pip3 install -r requirements.txt
```

then index the data:

```
python3 indexing.py
```

then run the Flask server, app.py.

Finally, to run the frontend, go to the frontend folder:

```
npm install
```

then build the web app:

```
npm run build
```

and start it:

```
npm run start
```

Acknowledgement

This work was created by Mustafa Aljadery & Siddharth Sharma.
