<a href="https://colab.research.google.com/github/Brunozml/artistotllm/blob/main/avg_sentence_length.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# Sentence-length analysis only (word- and paragraph-level analysis removed)

import re
import requests
from typing import Dict, Tuple

def calculate_average_sentence_length(text: str) -> Dict[str, float]:
    """Calculate average sentence length"""
    # Clean and prepare text
    text = text.strip()

    # Sentence analysis
    sentences = re.split(r'[.!?]+', text)
    sentences = [s.strip() for s in sentences if s.strip()]
    # Count words in each sentence
    sentence_word_counts = [len(re.findall(r'\b\w+\b', s)) for s in sentences]
    avg_sentence_length = sum(sentence_word_counts) / len(sentences) if sentences else 0

    return {
        "avg_sentence_length": avg_sentence_length,
        "total_sentences": len(sentences)
    }

def compare_sentence_lengths(text1: str, text2: str) -> Tuple[Dict, Dict, Dict]:
    """Compare average sentence lengths between two texts"""
    stats1 = calculate_average_sentence_length(text1)
    stats2 = calculate_average_sentence_length(text2)

    # Calculate differences
    differences = {
        "sentence_length_diff": stats2["avg_sentence_length"] - stats1["avg_sentence_length"]
    }

    return stats1, stats2, differences

def read_file(filepath: str):
    """Read text from a local path or an http(s) URL.

    Returns the contents as a string, or None if the read fails.
    """
    if filepath.startswith(('http://', 'https://')):
        try:
            # A timeout keeps the cell from hanging on an unresponsive server
            response = requests.get(filepath, timeout=10)
            response.raise_for_status()
            return response.text
        except requests.exceptions.RequestException as e:
            print(f"Error fetching URL {filepath}: {e}")
            return None
    else:
        try:
            # Explicit encoding avoids platform-dependent defaults
            with open(filepath, 'r', encoding='utf-8') as f:
                return f.read()
        except OSError as e:
            print(f"Error opening local file {filepath}: {e}")
            return None

if __name__ == "__main__":
    # File paths
    data_path = 'https://raw.githubusercontent.com/Brunozml/artistotllm/main/data/raw/'
    file1 = 'gpt_what_to_do.txt'
    file2 = 'hypewrite_what_to_do.txt'

    # Read files
    text1 = read_file(data_path + file1)
    text2 = read_file(data_path + file2)

    if text1 is not None and text2 is not None:
        # Compare texts
        stats1, stats2, differences = compare_sentence_lengths(text1, text2)

        # Print results
        print(f"Text 1: {file1}")
        print(f"   Average sentence length: {stats1['avg_sentence_length']:.1f} words")
        print(f"   Total sentences: {stats1['total_sentences']}")

        print(f"\n📄 Text 2: {file2}")
        print(f"   Average sentence length: {stats2['avg_sentence_length']:.1f} words")
        print(f"   Total sentences: {stats2['total_sentences']}")

        print(f"\n🔍 Differences (Text 2 - Text 1):")
        print(f"   Sentence length: {differences['sentence_length_diff']:+.1f} words")
    else:
        print("Failed to read one or both files")


📄 Text 1: gpt_what_to_do.txt
   Average sentence length: 14.3 words
   Total sentences: 19

📄 Text 2: hypewrite_what_to_do.txt
   Average sentence length: 10.6 words
   Total sentences: 54

🔍 Differences (Text 2 - Text 1):
   Sentence length: -3.7 words
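
One caveat about the sentence counts above: splitting on `[.!?]+` treats every period as a sentence boundary, including the one inside a decimal number. The sketch below contrasts that rule with a variant that splits only when the punctuation is followed by whitespace or the end of the text. The function names and the sample string are illustrative assumptions, not part of the notebook.

```python
import re
from typing import List

def split_sentences_naive(text: str) -> List[str]:
    """Same rule as the notebook: split on any run of '.', '!' or '?'."""
    return [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]

def split_sentences_boundary(text: str) -> List[str]:
    """Hypothetical variant: split only when the punctuation run is
    followed by whitespace or the end of the text."""
    return [s.strip() for s in re.split(r'[.!?]+(?=\s|$)', text) if s.strip()]

# Illustrative sample (not from the data files)
sample = "Version 3.5 shipped today. It works!"
print(split_sentences_naive(sample))     # ['Version 3', '5 shipped today', 'It works']
print(split_sentences_boundary(sample))  # ['Version 3.5 shipped today', 'It works']
```

The variant is not a full sentence tokenizer: abbreviations such as "Dr." still end a sentence, and it will not split where a period is immediately followed by the next sentence with no space in between.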


In [3]:

text1_manual = "What should one do? That may seem a strange question, but it's not meaningless or unanswerable. It's the sort of question kids ask before they learn not to ask big questions. I only came across it myself in the process of investigating something else. But once I did, I thought I should at least try to answer it.So what should one do? One should help people, and take care of the world. Those two are obvious. But is there anything else? When I ask that, the answer that pops up is Make good new things.I can't prove that one should do this, any more than I can prove that one should help people or take care of the world. We're talking about first principles here. But I can explain why this principle makes sense. The most impressive thing humans can do is to think. It may be the most impressive thing that can be done. And the best kind of thinking, or more precisely the best proof that one has thought well, is to make good new things.I mean new things in a very general sense. Newton's physics was a good new thing. Indeed, the first version of this principle was to have good new ideas. But that didn't seem general enough: it didn't include making art or music, for example, except insofar as they embody new ideas. And while they may embody new ideas, that's not all they embody, unless you stretch the word \"idea\" so uselessly thin that it includes everything that goes through your nervous system.Even for ideas that one has consciously, though, I prefer the phrasing \"make good new things.\" There are other ways to describe the best kind of thinking. To make discoveries, for example, or to understand something more deeply than others have. But how well do you understand something if you can't make a model of it, or write about it? Indeed, trying to express what you understand is not just a way to prove that you understand it, but a way to understand it better.Another reason I like this phrasing is that it biases us toward creation. 
It causes us to prefer the kind of ideas that are naturally seen as making things rather than, say, making critical observations about things other people have made. Those are ideas too, and sometimes valuable ones, but it's easy to trick oneself into believing they're more valuable than they are. Criticism seems sophisticated, and making new things often seems awkward, especially at first; and yet it's precisely those first steps that are most rare and valuable.Is newness essential? I think so. Obviously it's essential in science. If you copied a paper of someone else's and published it as your own, it would seem not merely unimpressive but dishonest. And it's similar in the arts. A copy of a good painting can be a pleasing thing, but it's not impressive in the way the original was. Which in turn implies it's not impressive to make" #@param {type:"string"}
text2_manual = "copies of good things. It's not impressive to make a copy of a good novel, or a good song, or a good scientific discovery. The only thing that's impressive is to make something new, something that didn't exist before. And that's true not just in science and the arts, but in all areas of life. A new business, a new product, a new service - all of these are new things, and all of them require the same kind of thinking that makes good new things." #@param {type:"string"}

if text1_manual and text2_manual:
  # Compare texts
  stats1, stats2, differences = compare_sentence_lengths(text1_manual, text2_manual)

  # Print results
  print(f"Text 1 (Manual Input):")
  print(f"   Average sentence length: {stats1['avg_sentence_length']:.1f} words")
  print(f"   Total sentences: {stats1['total_sentences']}")

  print(f"\n📄 Text 2 (Manual Input):")
  print(f"   Average sentence length: {stats2['avg_sentence_length']:.1f} words")
  print(f"   Total sentences: {stats2['total_sentences']}")

  print(f"\n🔍 Differences (Text 2 - Text 1):")
  print(f"   Sentence length: {differences['sentence_length_diff']:+.1f} words")
else:
  print("Please enter text for both inputs to compare.")

Text 1 (Manual Input):
   Average sentence length: 14.1 words
   Total sentences: 37

📄 Text 2 (Manual Input):
   Average sentence length: 17.8 words
   Total sentences: 5

🔍 Differences (Text 2 - Text 1):
   Sentence length: +3.7 words
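
The averages above depend on how words are counted. The pattern `\b\w+\b` stops at apostrophes, so "that's" and "didn't" each count as two words. The five sentences of Text 2 tokenize to 4, 21, 17, 17 and 30 words, which gives the reported 89 / 5 = 17.8. The sketch below compares the notebook's pattern with a variant that keeps contractions whole. The function names are hypothetical, and the sample is one sentence taken from `text2_manual`.

```python
import re

def count_words_naive(sentence: str) -> int:
    """Same pattern as the notebook: every run of word characters is a word."""
    return len(re.findall(r'\b\w+\b', sentence))

def count_words_keep_contractions(sentence: str) -> int:
    """Hypothetical variant: allow internal apostrophes so "didn't" is one word."""
    return len(re.findall(r"\w+(?:['’]\w+)*", sentence))

# Third sentence of text2_manual, without its final period
sample = ("The only thing that's impressive is to make something new, "
          "something that didn't exist before")
print(count_words_naive(sample))              # 17
print(count_words_keep_contractions(sample))  # 15
```

Either convention works for comparing two texts as long as the same one is applied to both. The absolute averages shift when the convention changes, so figures from the two variants should not be mixed.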
