### MLX Fine-tuning

Code authored by: Shaw Talebi and edited by Brandon Rodriguez <br>
Video link: https://youtu.be/3PIqhdRzhxE <br>
Blog link: https://towardsdatascience.com/local-llm-fine-tuning-on-mac-m1-16gb-f59f4f598be7 <br>
<br>
Source: https://github.com/ml-explore/mlx-examples/tree/main/lora

### imports

In [1]:
import subprocess
from mlx_lm import load, generate

### functions

In [2]:
def run_command_with_live_output(command: list[str]) -> None:
    """
    Courtesy of ChatGPT:
    Runs a command and prints its output line by line as it executes.

    Args:
        command (list[str]): The command and its arguments to be executed.

    Returns:
        None
    """
    # Merge stderr into stdout so a full stderr pipe cannot block the child
    # process while stdout is still being read
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )

    # Print the output line by line until the stream closes
    for line in process.stdout:
        print(line.rstrip())

    # Wait for the process to exit
    process.wait()

In [3]:
def construct_shell_command(command: list[str]) -> str:
    # Join the arguments with spaces, leaving characters inside each argument untouched
    return " ".join(command)

### Quantize Model (optional)
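
This section has no code in the notebook. As a minimal sketch, assuming quantization is done with the `mlx_lm.convert` command-line tool, the command could be assembled as below. The source path and output directory are placeholders, and 3 bits is an assumption inferred from the `3bit` suffix of the model name used in the next section. Depending on the installed mlx-lm version, the entry point may need to be invoked differently. The cell only prints the command rather than running it.

```python
import shlex

# Placeholder values (assumptions, not taken from the notebook)
hf_path = "path/or/repo-of-merged-model"  # hypothetical source model
mlx_path = "merged-model-mlx-3bit"        # hypothetical output directory

quantize_command = [
    "python", "-m", "mlx_lm.convert",
    "--hf-path", hf_path,
    "--mlx-path", mlx_path,
    "-q",             # enable quantization
    "--q-bits", "3",  # bits per weight (assumed from the model name)
]

# Show the shell form of the command without executing it
print(shlex.join(quantize_command))
```

To execute it, the list could be passed to `run_command_with_live_output` defined above.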

### Run inference with quantized model

In [4]:
model_path = "br2835/mistraladaptmerged-mlx-3bit"
max_tokens = 64

In [5]:
content = """
[SYSTEM]
You are a cautious financial analysis assistant.

Company scope:
- In-scope companies: Nvidia (NVDA), Intel (INTC), TSMC / Taiwan Semiconductor Manufacturing Company (TSM), Samsung / Samsung Electronics.
- If the question is ONLY about companies outside this set:
  - Do NOT analyze them.
  - Use the OUT_OF_SCOPE_COMPANY template and explain that you lack data for those companies.
- If the question mixes in-scope and out-of-scope companies:
  - Clearly say you only have data for the in-scope names.
  - Focus the analysis on in-scope companies and say you cannot properly analyze the others.

Evidence rules:
- Treat snippets as the only source of truth.
- If snippets conflict, say so briefly.
- If snippets are sparse or don’t really answer the question, use the LIMITED_INFORMATION_RESPONSE template.
- Never invent numbers, dates, or details not supported by snippets.
- Never give personalized investment advice (no “you should buy/sell”).
- Use the snippets as evidence and cite them using [S1], [S2], etc.
- Use the news articles as supplementary context. Do not cite them.

High-level steps:
1. Identify which companies the user is asking about and whether they are in-scope or out-of-scope. Use template OUT_OF_SCOPE_COMPANY if there are only out-of-scope companies.
2. Review snippets for relevant information about the in-scope companies and the question.
3. Choose exactly ONE template whose “Use when” condition best matches the situation (performance, risk, comparison, limited info, out-of-scope, etc.). Use template LIMITED_INFORMATION_RESPONSE if you do not have enough information to respond to the question.
4. Fill in the template pattern with grounded phrases from the snippets and respond in 2–5 sentences with citations.

Tone:
- Professional, neutral, and concise.
- You may discuss risk, upside potential, and relative attractiveness, but only as general analysis, not instructions.



[USER QUESTION]
Is Intel doing well?


Each snippet includes an ID and its source.

[S1] (file=Intel2025_10k.pdf | page=247)
Intel Corporation By:_________________________ Name: Title: 7

[S2] (file=Intel2025_10k.pdf | page=246)
Patrick Gelsinger Date 6

[S3] (file=Intel2025_10k.pdf | page=16)
stabilize from a soft macroeconomic environment and inflationary pressures, with PC supply and demand levels beginning to normalize. We remain positive on the long-term outlook for PCs, as household density is stable to increasing, educational device penetration rates remain low outside of the US, and PC usage remains elevated compared to pre-pandemic rates1. Commercial growth opportunities also remain as corporations expand the size of their PC fleets, while also replacing older devices. Curren...

[S4] (file=Intel2025_10QSep.pdf | page=158)
Schedule I Partners (See attached.)

[S5] (file=Intel2025_10QSep.pdf | page=72)
(or equivalent governing body) of such Person and (iii) with respect to whom Intel owns, or has the right to receive, more than 50% of the total economic value of such Person (including upon liquidation and otherwise) or (b) any other Person that owns, directly or indirectly, 100% of the 17

[S6] (file=Intel2025_10k.pdf | page=19)
Table of Contents Intel Products Financial Performance1 Dec 28, 2024 (In Millions) CCG DCAI NEX Total Revenue $ 30,290 $ 12,817 $ 5,842 $ 48,949 Cost of sales 14,569 6,792 2,457 23,818 Gross margin 15,721 6,025 3,385 25,131 Operating expenses 4,801 4,687 2,454 11,942 Operating income $ 10,920 $ 1,338 $ 931 $ 13,189 Gross margin % 52% 47% 58% 51% Operating margin % 36% 10% 16% 27% Dec 30, 2023 (In Millions) CCG DCAI NEX Total Revenue $ 29,258 $ 12,635 $ 5,774 $ 47,667 Cost of sales 14,606 6,420 3...

[S7] (file=Intel2025_10k.pdf | page=16)
technologies together. 1 Source: Intel calculated PC density from industry analyst reports. 2 Source: Intel calculated volume of devices over four years old from industry analyst reports and internal data. 3 Source: Intel calculated multi-year TAM forecast derived from industry analyst reports. MD&A 14


[RECENT NEWS]
Below are recent news articles related to the company.
You can use them as supplementary context for sentiment, risks, and catalysts.

No relevant news articles within the past 7 days were retrieved.

[CANDIDATE TEMPLATES]

You must choose exactly ONE of the candidate templates below.


[T1] TEMPLATE_ID: TURNAROUND_RECOVERY
Use when: Past weakness but recent credible improvements (restructuring, margin recovery, debt reduction).
Pattern: Company ___ appears to be in a turnaround phase. After prior weakness in ___ [Sx], recent developments such as ___ and improvements in ___ indicate early signs of recovery [Sy]. The trajectory is improving, but execution risk remains.


[T2] TEMPLATE_ID: NEUTRAL_MIXED
Use when: Signals are mixed (some metrics up, others down; conflicting commentary).
Pattern: Company ___ presents a mixed picture. On one hand, ___ has improved or remained resilient [Sx], but on the other, ___ has weakened or introduces uncertainty [Sy]. Overall, the outlook is balanced with both upside and downside factors to monitor.


[T3] TEMPLATE_ID: MODERATE_BULL_IMPROVING
Use when: Trends are improving (growth re-accelerating, margins recovering) but not spectacular; some risks remain.
Pattern: Company ___ shows improving performance, with ___ trending higher and ___ stabilizing [Sx]. While risks such as ___ remain, the overall trajectory appears positive over the recent period [Sy].


[T4] TEMPLATE_ID: MATURE_STABLE_INCOME
Use when: Low-to-moderate growth, stable cash flows, often dividends; defensive profile.
Pattern: Company ___ operates as a mature, relatively stable business. Revenue and cash flows from ___ are steady, with limited but predictable growth [Sx]. This profile may appeal to investors seeking income and stability rather than aggressive upside.


[T5] TEMPLATE_ID: STRONG_BULL_STABLE
Use when: Revenue, earnings, and key metrics are consistently strong; volatility and risk appear limited.
Pattern: Company ___ demonstrates consistently strong and stable performance, supported by ___ [Sx]. Growth in ___ and solid margins in ___ underpin a positive outlook, with only limited near-term risks from ___ [Sy].


[T6] TEMPLATE_ID: GOVERNANCE_OR_EXECUTION_RISK
Use when: Management credibility, execution on strategy, or governance structures are flagged.
Pattern: For Company ___, governance and execution are key risks. Sources point to concerns around ___, management decisions on ___, or challenges delivering on ___ [Sx]. These factors may weigh on investor confidence even if the core fundamentals remain ___ [Sy].



[FALLBACK TEMPLATES — ONLY IF NONE OF THE CANDIDATES APPLY]

[TX] TEMPLATE_ID: LIMITED_INFORMATION_RESPONSE
Use when: Snippets provide little or no relevant information to answer the question confidently.
Pattern: "The available information about Company ___ is limited. The snippets mainly discuss ___ [Sx] and provide little detail on ___ [Sy], so any conclusion would be uncertain."

[TY] TEMPLATE_ID: OUT_OF_SCOPE_COMPANY
Use when: The question is primarily or entirely about companies outside the allowed set.
Pattern: "I do not have sufficient data to analyze the companies mentioned in this question. This system is restricted to discussing Nvidia, Intel, TSMC, and Samsung, and the snippets do not cover the requested company or companies, so I cannot provide a reliable analysis."

[TZ] TEMPLATE_ID: FREEFORM_SUPPORTED
Use when: None of the templates cleanly match, but snippets do contain enough to answer.
Pattern: “Based on the snippets, … [Sx] … [Sy] …”

[RESPONSE FORMAT]
Return ONLY a single-line JSON object (no markdown, no extra text) with EXACTLY these keys:

{
  "template_id": "<ONE template ID you chose (prefer a candidate template; use fallback only if none apply) like MATURE_STABLE_INCOME>",
  "answer": "<2–5 sentences answering the user, grounded in snippets, with citations like [S1], [S2]>",
  "used_snippets": "<The snippets used in your answer like ["S1","S3"], if you used none return []>"
}
"""

In [6]:
import time

start_time = time.time()

model, tokenizer = load(
    model_path,
    tokenizer_config={"fix_mistral_regex": True},
)

def run(prompt_text, max_tokens=64):
    messages = [{"role": "user", "content": prompt_text}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)

# warm-up
_ = run("Hi", max_tokens=8)

end_time = time.time() 
execution_time = end_time - start_time
print(f"\nExecution time: {execution_time:.4f} seconds")

Execution time: 31.1849 seconds


In [7]:
start_time = time.time()
# now run the real tests
out = run(content, max_tokens=190)

end_time = time.time() 
execution_time = end_time - start_time
print(f"\nExecution time: {execution_time:.4f} seconds")
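
The start/end timing pattern repeats in several cells. As a sketch, a small context manager could wrap it; `timed` is a hypothetical helper name, not part of the notebook, and the workload below is a stand-in for a call such as `run(content, max_tokens=190)`.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str = "Execution time"):
    """Print the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"\n{label}: {elapsed:.4f} seconds")

# Stand-in workload; in the notebook this would be the model call
with timed():
    total = sum(range(1000))
```

`time.perf_counter()` is used here instead of `time.time()` because it is a monotonic clock intended for measuring intervals.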

In [8]:
out

In [9]:
import json, re
from typing import Any, Dict

def parse_llm_json(text: str) -> Dict[str, Any]:
    # 1) Strip common code fences
    text = text.strip()
    text = re.sub(r"^```(?:json)?\s*", "", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*```$", "", text)

    # 2) If the model adds extra text around the JSON, keep the span from the first "{" to the last "}"
    m = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if not m:
        raise ValueError("No JSON object found in model output.")
    payload = m.group(0)

    obj = json.loads(payload)

    # 3) Validate shape
    required = {"template_id", "answer", "used_snippets"}
    if set(obj.keys()) != required:
        raise ValueError(f"Bad keys. Expected {required}, got {set(obj.keys())}")
    if not isinstance(obj["template_id"], str):
        raise TypeError("template_id must be a string")
    if not isinstance(obj["answer"], str):
        raise TypeError("answer must be a string")
    if not (isinstance(obj["used_snippets"], list) and all(isinstance(x, str) for x in obj["used_snippets"])):
        raise TypeError("used_snippets must be a list[str]")

    return obj
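
The regular expression in step 2 is greedy, so it spans from the first `{` to the last `}` in the text. If the model appends more text that contains braces, `json.loads` fails on the combined span. One possible alternative, sketched here with a hypothetical helper `extract_first_json_object`, uses `json.JSONDecoder.raw_decode` from the standard library, which parses a single JSON value and ignores whatever follows it. The sample string is made up for illustration.

```python
import json
from typing import Any, Dict

def extract_first_json_object(text: str) -> Dict[str, Any]:
    """Parse the first JSON object in text, ignoring anything after it."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # No valid JSON starts at this brace; try the next one
            start = text.find("{", start + 1)
    raise ValueError("No JSON object found in model output.")

# Made-up sample: valid JSON followed by stray text containing braces
sample = 'Sure: {"template_id": "NEUTRAL_MIXED", "answer": "ok", "used_snippets": ["S1"]} {note}'
parsed = extract_first_json_object(sample)
print(parsed["template_id"])
```

The shape validation in step 3 would still apply to the returned dictionary.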

In [10]:
# text = parse_llm_json(out)

In [11]:
# text

# Just a note: sudo sysctl iogpu.wired_limit_mb=13000

Just a note: user_wired_limit_mb - 3500

In [12]:
from __future__ import annotations

import os
import json
import pprint
import time
from typing import Callable, Dict, Any, Optional

from openpyxl import load_workbook

EXCEL_CELL_CHAR_LIMIT = 32767  # Excel hard limit per cell

def format_parsed_obj_for_excel(obj: Dict[str, Any], mode: str = "pformat") -> str:
    """
    mode:
      - "pformat": Python-dict style (single quotes), pretty printed (like your example)
      - "json": valid JSON string (double quotes)
      - "repr": compact python repr (single line)
    """
    if mode == "json":
        return json.dumps(obj, ensure_ascii=False)
    if mode == "repr":
        return repr(obj)
    # default: pretty python dict
    return pprint.pformat(obj, width=120, sort_dicts=False)

def clamp_excel_cell(s: str) -> str:
    if len(s) <= EXCEL_CELL_CHAR_LIMIT:
        return s
    return s[: EXCEL_CELL_CHAR_LIMIT - 50] + " ... <TRUNCATED_FOR_EXCEL_CELL_LIMIT>"

def fill_pretuned_answers_with_full_json(
    input_xlsx: str,
    output_xlsx: str,
    run_fn: Callable[[str], str],
    parse_fn: Callable[[str], Dict[str, Any]],
    prompt_col_name: str = "Prompt",
    output_col_name: str = "Pretuned Answer",
    max_rows: Optional[int] = None,
    autosave_every: int = 5,
    skip_if_filled: bool = True,
    format_mode: str = "pformat",  # "pformat" matches your example
) -> None:
    # Resume if output already exists, else copy from input
    wb = load_workbook(output_xlsx) if os.path.exists(output_xlsx) else load_workbook(input_xlsx)
    ws = wb.active  # or wb["Sheet1"]

    header_row = 1
    headers = {ws.cell(row=header_row, column=c).value: c for c in range(1, ws.max_column + 1)}

    if prompt_col_name not in headers:
        raise KeyError(f"Couldn't find column '{prompt_col_name}'. Found: {list(headers.keys())}")

    prompt_col = headers[prompt_col_name]

    # Create output col if missing
    if output_col_name in headers:
        out_col = headers[output_col_name]
    else:
        out_col = ws.max_column + 1
        ws.cell(row=header_row, column=out_col).value = output_col_name

    start_row = header_row + 1
    processed = 0

    for r in range(start_row, ws.max_row + 1):
        if max_rows is not None and processed >= max_rows:
            break

        prompt_val = ws.cell(row=r, column=prompt_col).value
        if prompt_val is None or str(prompt_val).strip() == "":
            continue

        out_cell = ws.cell(row=r, column=out_col)
        if skip_if_filled and out_cell.value not in (None, "", "nan"):
            continue

        try:
            start_time = time.time()
            raw = run_fn(str(prompt_val))
            end_time = time.time() 
            execution_time = end_time - start_time
            print(f"\nExecution time: {execution_time:.4f} seconds")
            print(raw)
            obj = parse_fn(raw)

            cell_text = format_parsed_obj_for_excel(obj, mode=format_mode)
            out_cell.value = clamp_excel_cell(cell_text)

        except Exception as e:
            out_cell.value = clamp_excel_cell(
                format_parsed_obj_for_excel(
                    {"error_type": type(e).__name__, "error": str(e)},
                    mode=format_mode
                )
            )

        processed += 1
        if autosave_every and processed % autosave_every == 0:
            wb.save(output_xlsx)

    wb.save(output_xlsx)
    print(f"Done. Wrote full parsed objects into '{output_col_name}' at: {output_xlsx}")

In [13]:
fill_pretuned_answers_with_full_json(
    input_xlsx="adaptertestingresults.xlsx",
    output_xlsx="adapter_testing_results_filled.xlsx",
    run_fn=lambda p: run(p, max_tokens=190),
    parse_fn=parse_llm_json,
    autosave_every=3,
    format_mode="pformat",  # produces {'template_id': ...} style
)


Execution time: 144.1873 seconds
{"template_id":"NEUTRAL_MIXED","answer":"Company Nvidia presents a mixed picture. On one hand, revenue is substantially higher, with total revenue of $46,743 vs $30,040 and year-end revenue of $15,068 vs $11,906 [S1][S4]. On the other, Nvidia warns of integration challenges that have impacted results and may continue to do so [S3]. Overall, the outlook is balanced with both upside and downside factors to monitor [S3][S4].","used_snippets":["S1","S3","S4"]}

Execution time: 130.5673 seconds
{"template_id":"NEUTRAL_MIXED","answer":"Company Intel presents a mixed picture. On one hand, Intel says it is stabilizing from a soft macroeconomic environment and sees a positive long-term outlook for PCs, with elevated PC usage and commercial growth opportunities [S3]. On the other hand, the financial performance shows mixed results, with some segments like CCG and NEX showing higher margins and operating margins, while DCAI has much lower margins and operating ma
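
The second output above stops mid-sentence. If the cut comes from the `max_tokens=190` limit (an assumption; the notebook does not say), the text is not valid JSON, so `parse_fn` raises and the row receives an error object instead of an answer. A minimal sketch of detecting that case separately, using a hypothetical helper `looks_truncated` and made-up shortened strings:

```python
import json

def looks_truncated(raw: str) -> bool:
    """Heuristic: the text starts a JSON object but does not parse as one."""
    text = raw.strip()
    if not text.startswith("{"):
        return False
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False

complete = '{"template_id": "NEUTRAL_MIXED", "answer": "ok", "used_snippets": []}'
cut_off = '{"template_id": "NEUTRAL_MIXED", "answer": "Company Intel presents a mixed pic'

print(looks_truncated(complete))  # False
print(looks_truncated(cut_off))   # True
```

A caller could use such a check to retry the affected rows with a larger `max_tokens` value.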

In [14]:
# sudo sysctl iogpu.wired_limit_mb=12000