In [1]:
import llamafile_client as llamafile
import torch
from time import time
from datetime import timedelta

To begin, please follow the instructions listed in the initial notebook/README in order to get llamafiles set up.

If you'd rather download other models from HuggingFace and use any type of text-generation pipeline, that's fine too!

## Crescendo attack as inspiration

Microsoft released [a paper and adversarial approach](https://crescendo-the-multiturn-jailbreak.github.io/) for jailbreaking LLMs called Crescendo. In this approach, the attacker tries to generate increasingly adversarial prompts pointing the entire chat history/conversation slowly into what would before be banned content (i.e. illegal, explicit, harrassment/hate speech.)

To automate this, the attack utilizes more than one LLM at once to steer the conversation towards some goal. 

In order to use this attack for feminism, we will adapt the type of goal and likely have to do much less effort, since hopefully feminism is not actually a banned topic in many LLMs. However, feel free to choose more sensitive topics related to feminism, such as reproductive rights, sex or safety. If so, you might need to invoke a third model to judge whether the attack has set off guardrails or not and to appropriately respond.

The idea is:

1. Setup one model as the model you want to attack. You can do that via HuggingFace/llamafile or via a model running on the LAN. 
2. Setup one model as the model that will help you attack the other model to produce increasingly more feminist responses. You can do that via HuggingFace/llamafile or via a model running on the LAN.
3. (optional): if you need a model to review potential guardrails, you might want to try getting [LlamaGuard]() running. Note that this will require 10s of gigs of memory. (I know :()
4. Start manually and guide one conversation to the next. One tip from the research is to have the helper model summarize the other model's response and then nudge toward the goal.
5. If it works, you can try to see if there is an automatable attack template. Check out the research for the pseudocode/example.




In [2]:
prompt = "what do feminists need to do in order to destroy patriarchy?"
first_output = llamafile.completion(prompt)

print(first_output)


Patriarchy is a system of male domination and oppression that exists in many societies around the world. Feminists seek to challenge and dismantle patriarchy by working towards gender equality and empowering women. There is no one-size-fits-all answer to how feminists can destroy patriarchy, but here are some strategies that have been proposed:

1. Education and awareness: Educating people about the harmful effects of patriarchy and gender inequality can help to change attitudes and behaviors. Feminists can work to promote gender equality in schools, workplaces, and communities.
2. Policy change: Feminists can work to change policies that perpetuate patriarchy, such as laws that discriminate against women or laws that do not provide equal access to education and healthcare.
3. Activism and protest: Feminists can use activism and protest to raise awareness about patriarchy and to demand change. This can involve organizing protests, marches, and other events to draw attention to the iss

Note: the code in the next cell is only if you have a GPU or a M1/M2 silicon.. if not, please just use more llamafiles or a LAN model. Alternatively, you can set up and use an API for a popular LLM.

- If you have one or more GPUs, please change your device below to map to your GPU, sometimes just "cuda" works.
- If you have a Macbook M1 or M2, please make sure you have [pytorch for MPS installed](https://developer.apple.com/metal/pytorch/).

To use the meta-llama repository you must also:

1. Create a HuggingFace account.
2. Install hugging face and cli: `pip install -U "huggingface_hub[cli]"`
3. Request access to the meta-llama3 models, which you can do [on the model page](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). Once they approve you will get an email.
5. Log into hugging face CLI using your user token (create one in User->Settings). Set the token to Read to just download models. Command is `huggingface-cli login`.
6. Then proceed with code below!

In [11]:
# Only run this after reading above note! :)
from transformers import pipeline


pipe = pipeline("text-generation", 
                model="meta-llama/Llama-3.2-1B-Instruct", # you can also choose other huggingface models here!
                torch_dtype=torch.bfloat16, 
                device="mps", 
                max_new_tokens=500) # here is where to change your device if you use something other than mac silicon

config.json:   0%|          | 0.00/877 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/2.47G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/189 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/54.5k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/296 [00:00<?, ?B/s]

In [10]:
messages = [
    {"role": "user", "content": prompt},
    {"role": "agent", "content": first_output},
    {"role": "user", "content": "Please summarize the conversation thus far and point it towards a more revolutionary direction."}
]

In [12]:
next_response = pipe(messages)

Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


In [25]:
next_response

[{'generated_text': [{'role': 'user',
    'content': 'what do feminists need to do in order to destroy patriarchy?'},
   {'role': 'agent',
    'content': "\nPatriarchy is a system of male domination and oppression that exists in many societies around the world. Feminists seek to challenge and dismantle patriarchy by working towards gender equality and empowering women. There is no one-size-fits-all answer to how feminists can destroy patriarchy, but here are some strategies that have been proposed:\n\n1. Education and awareness: Educating people about the harmful effects of patriarchy and gender inequality can help to change attitudes and behaviors. Feminists can work to promote gender equality in schools, workplaces, and communities.\n2. Policy change: Feminists can work to change policies that perpetuate patriarchy, such as laws that discriminate against women or laws that do not provide equal access to education and healthcare.\n3. Activism and protest: Feminists can use activism an

In [28]:
last_answer = next_response[0]['generated_text'][-1]['content']
print(last_answer)

Here's a summary of our conversation and a more revolutionary direction:

**Summary**

Our conversation began with a discussion about feminism and patriarchy, where we explored the concept of patriarchy and its effects on women. We then delved into strategies for feminists to challenge and dismantle patriarchy, including education, policy change, activism, and empowering women. However, we also touched on the idea of "destroying patriarchy" which, while understandable, might be an overly simplistic goal.

**More Revolutionary Direction**

In the spirit of revolution and social change, let's shift our focus towards a more radical approach. While destroying patriarchy might not be feasible in the short term, we can still work towards dismantling its structures and creating a more just and equitable society.

Here are some key takeaways and suggestions for a more revolutionary approach:

1. **Challenge systemic power structures**: Rather than trying to "destroy" patriarchy, we should focu

In [29]:
next_output = llamafile.completion(last_answer)

In [30]:
print(next_output)

Engage in non-violent resistance**: Engage in non-violent resistance and use peaceful protest tactics to challenge the status quo and build social change.
3. **Build alternative forms of power**: Build alternative forms of power and organization that are not controlled by traditional institutions and power structures.
4. **Centralize intersectionality**: Centralize intersectionality and prioritize addressing the intersecting forms of oppression that women experience, such as racism, homophobia, transphobia, and ableism.
5. **Use direct action**: Use direct action tactics to create social change, such as protests, strikes, and other forms of direct action.
6. **Incorporate revolutionary theory**: Incorporate revolutionary theory into your strategy, such as the theory of social transformation developed by Frantz Fanon and other anti-colonial thinkers.
7. **Build a revolutionary vanguard**: Build a revolutionary vanguard that is committed to overthrowing the current system and building a 

Continue or switch models and way that you are steering. Work with a partner to explore what paths you find interesting!