Tests #14

kirawi · 2021-07-19T02:59:20Z

I'm having some trouble with a few tests, so could you possibly take a look at what I need to do? I'll also take a look myself tomorrow. The results should be in the CI. I would guess around 200-300 of the fails are because of this:
[uni1A45|uni1A32@592,0|uni1A5B.ratha_nohost@1523,0|uni1A69@1523,0]
versus
[uni1A45|uni1A321A5B@592,0|uni1A69@1184,-734]
but I'm not sure how to do this.

(The code is very terrible right now, it'll be much better by merging time).

Generator:

```rs use std::{ fmt::{self, Display}, fs::File, io::{BufWriter, Write}, mem, path::PathBuf, };

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ParseError;

impl fmt::Display for ParseError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "missing closing quote")
}
}

impl std::error::Error for ParseError {}

enum State {
/// Within a delimiter.
Delimiter,
/// After backslash, but before starting word.
Backslash,
/// Within an unquoted word.
Unquoted,
/// After backslash in an unquoted word.
UnquotedBackslash,
/// Within a single quoted word.
SingleQuoted,
/// Within a double quoted word.
DoubleQuoted,
/// After backslash inside a double quoted word.
DoubleQuotedBackslash,
/// Inside a comment.
Comment,
InPath,
}

pub fn split(s: &str) -> Result<Vec, ParseError> {
use State::*;

let mut words = Vec::new();
let mut word = String::new();
let mut chars = s.chars();
let mut state = Delimiter;

loop {
    let c = chars.next();
    state = match state {
        Delimiter => match c {
            None => break,
            Some('\'') => SingleQuoted,
            Some(c @ '[') => {
                word.push(c);
                SingleQuoted
            }
            Some('\"') => DoubleQuoted,
            Some('\\') => Backslash,
            Some('\t') | Some(' ') | Some('\n') | Some(':') | Some(';') => Delimiter,
            Some('#') => Comment,
            Some(c) => {
                word.push(c);
                Unquoted
            }
        },
        Backslash => match c {
            None => {
                word.push('\\');
                words.push(mem::replace(&mut word, String::new()));
                break;
            }
            Some('\n') => Delimiter,
            Some(c) => {
                word.push(c);
                Unquoted
            }
        },
        Unquoted => match c {
            None => {
                words.push(mem::replace(&mut word, String::new()));
                break;
            }
            Some('\'') => SingleQuoted,
            Some(c @ '[') => {
                word.push(c);
                SingleQuoted
            }
            Some('\"') => DoubleQuoted,
            Some('\\') => UnquotedBackslash,
            Some('\t') | Some(' ') | Some('\n') | Some(':') | Some(';') => {
                words.push(mem::replace(&mut word, String::new()));
                Delimiter
            }
            Some(c @ '/') => {
                word.push(c);
                InPath
            }
            Some(c) => {
                word.push(c);
                Unquoted
            }
        },
        UnquotedBackslash => match c {
            None => {
                word.push('\\');
                words.push(mem::replace(&mut word, String::new()));
                break;
            }
            Some('\n') => Unquoted,
            Some(c) => {
                word.push(c);
                Unquoted
            }
        },
        SingleQuoted => match c {
            None => return Err(ParseError),
            Some(c @ ']') => {
                word.push(c);
                Unquoted
            }
            Some('\'') => Unquoted,
            Some(c) => {
                word.push(c);
                SingleQuoted
            }
        },
        DoubleQuoted => match c {
            None => return Err(ParseError),
            Some('\"') => Unquoted,
            Some('\\') => DoubleQuotedBackslash,
            Some(c) => {
                word.push(c);
                DoubleQuoted
            }
        },
        DoubleQuotedBackslash => match c {
            None => return Err(ParseError),
            Some('\n') => DoubleQuoted,
            Some(c @ '$') | Some(c @ '`') | Some(c @ '"') | Some(c @ '\\') => {
                word.push(c);
                DoubleQuoted
            }
            Some(c) => {
                word.push('\\');
                word.push(c);
                DoubleQuoted
            }
        },
        Comment => match c {
            None => break,
            Some('\n') => Delimiter,
            Some(_) => Comment,
        },
        InPath => match c {
            None => break,
            Some(';') => {
                words.push(mem::replace(&mut word, String::new()));
                Delimiter
            }
            Some(c) => {
                word.push(c);
                InPath
            }
        },
    }
}

Ok(words)

}

#[derive(Default, Debug)]
struct TestCase {
name: String,
path: PathBuf,
font_size: usize,
features: Vec<(String, u16)>,
variations: Vec<(String, f32)>,
input: Vec,
output: Vec,
}

impl Display for TestCase {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
if self.output.contains(&"*".to_string()) {
write!(
f,
"create_test!({}, {:?}, {}, &{:?}, &{:?}, &{:?});",
self.name,
self.path.to_str().unwrap(),
self.font_size,
self.features,
self.variations,
self.input,
)
} else {
write!(
f,
"create_test!({}, {:?}, {}, &{:?}, &{:?}, &{:?}, &{:?});",
self.name,
self.path.to_str().unwrap(),
self.font_size,
self.features,
self.variations,
self.input,
self.output,
)
}
}
}

impl TestCase {
pub fn parse(arguments: Vec) -> Self {
let mut case = Self::default();
case.font_size = 75;

    for arg in arguments.iter() {
        if arg.starts_with("../") {
            // Path
            case.path = arg.into();
        } else if arg.starts_with("--") {
            // Features/Variations
            let arg = arg.strip_prefix("--").unwrap();
            if arg.contains("=") {
                // Variation
                let split: Vec<_> = arg.split("=").collect();
                let variation: (String, f32) = (
                    split.first().unwrap().to_string(),
                    split.last().unwrap().parse().unwrap_or(0.0),
                );
                case.variations.push(variation);
            } else {
                // Feature
                case.features.push((arg.to_string(), 1));
            }
        } else if arg.starts_with("[") || arg == "*" {
            // Output
            let output = arg.trim_matches(|chr| chr == '[' || chr == ']');
            for chr in output.split('|') {
                // if chr.contains('=') {
                    // Failed attempt at parsing runs/positions.
                    // let mut split = chr.split('=');
                    // let unicode = split.next().unwrap().to_string();
                    // let mut split_for_offsets = split.next().unwrap().split('+');
                    // let x = split_for_offsets.next().unwrap().parse().expect(chr);
                    // let y = split_for_offsets.next().unwrap().parse().unwrap();
                    // case.output.push((unicode, x, y));
                // } else {
                    case.output.push(chr.to_string());
                // }
            };
        }
    }

    let input = arguments
        .get(
            arguments
                .len()
                .checked_sub(2)
                .unwrap_or_else(|| panic!("{:#?}", arguments)),
        )
        .unwrap();
    for chr in input.split(',') {
        let chr = chr.trim_start_matches("U+");
        let parsed_chr: char =
            char::from_u32(u32::from_str_radix(chr, 16).expect(chr)).expect(chr);
        case.input.push(parsed_chr);
    }

    case
}

}

fn main() -> std::io::Result<()> {
let input = PathBuf::from("data");
let output = PathBuf::from("output");
let rendering_tests = input.join("text-rendering-tests");
let aot_tests = input.join("aots");
let inhouse_tests = input.join("in-house");
std::fs::create_dir_all(output.clone())?;

if inhouse_tests.exists() {
    let directory = std::fs::read_dir(inhouse_tests.join("tests"))?
        .flatten()
        .filter(|file| file.file_type().unwrap().is_file());
    let mut file = BufWriter::new(File::create(output.join("text-rendering-tests.txt"))?);

    // Iterate over each file in the directory and generate tests.
    for test in directory {
        let content = std::fs::read_to_string(test.path())?;
        for line in content.lines() {
            // file.write(format!("{:#?}\n", split(line).unwrap()).as_bytes())?;
            let val = split(line).unwrap();
            if val.is_empty() {
                continue;
            }
            let formatted_string = format!("{}\n", TestCase::parse(val));
            file.write(formatted_string.as_bytes())?;
        }
    }
} else {
    panic!()
}

Ok(())

}

</p>
</details>

dfrg · 2021-07-19T04:51:00Z

I thought I had kept the code consistently formatted, but that appears to have not been the case. I'm going to go ahead and push a mass formatting commit to remove all the noise from the PR.

It seems that a lot of the failures may be due to missing script information. I'll address that in a review in a moment.

tests/shaping_impl.rs

kirawi · 2021-07-19T05:16:47Z

Yep, I've applied those. The remaining fails appear to be legitimate bugs, some bug on my part with runs, another bug on my part with opening the font, and the aforementioned issue where it doesn't seem to "merge" certain glyphs (referenced up above). I'm not sure if the last problem is just because I'm missing something in the test harness implementation.

dfrg · 2021-07-19T05:40:46Z

Some of these are tests of ancient unused cmap formats that I'm unlikely to support. There also appear to be tests that should probably pass but are missing glyph names for some reason. A couple are showing a y-offset that doesn't match.

kirawi · 2021-07-19T05:41:33Z

I (painfully) manually went through the failing tests, and there appear to be only 134 legitimate ones. Some may still be errors on my part, such as off by one errors (possibly rounding), but I included them just in case.

Notably, gpos, hvar, shknda, shlana (though Harfbuzz themselves has all that disabled because of a problem they had as well), and morx.

kirawi · 2021-07-19T05:50:29Z

Do you want to disable shlana as well? I'm not sure what it is, but 125/209 fail.

kirawi · 2021-07-19T05:53:39Z

With the tests I've disabled, the ratio is now 361:148. 108 are the errors I noted as being legitimate. Pretty good! By the way, I included glyph name issues as legitimate errors as well, so the actual number of "legitimate" errors may be significantly lower, possibly ~30 less.

dfrg · 2021-07-19T05:53:47Z

If HarfBuzz isn't testing against it, then probably best to remove it for now.

dfrg · 2021-07-19T05:56:15Z

This looks great so far! Thanks so much for taking the time.

kirawi · 2021-07-19T06:01:27Z

Of note:

#[test]
fn morx_24_0() {
    assert_eq!(
        shape("fonts/TestMORXTwentyfour.ttf", 0, &[], "ABCDE"),
        "*"
    )
}

seems to cause an endless loop. I've disabled it for now.

dfrg · 2021-07-19T06:06:17Z

So the gvar tests are odd. Looks like it's failing to load the fonts outright, so it might be using the old Apple style true header word or something. I'm not sure what's going on in those tests, but if they're expecting to extract mark offsets from contour points, then those will definitely be ignored as I don't have any intention of loading/scaling/hinting outlines during layout.

The hvar tests are good from what I can see. The interpolation is likely to be slightly different depending on implementation of the item variation store code (floating point vs fixed point), so off-by-one errors are to be expected.

The morx tests that panic with OOB in the glyph buffer are very troubling, so I'll likely prioritize those.

tests/shaping_impl.rs

kirawi · 2021-07-19T06:19:47Z

So I think for the rounding errors, what I think we could do is use a tuple struct over the vec used in shape(), and then implement the compare trait and allow it to be equal within <1 of error. We might want to identify what is actually rounding error or a logic bug first before that though.

dfrg · 2021-07-19T06:24:29Z

Agreed with rounding errors. I'd rather leave them in for the moment to make sure they're not errors in the lookups.

dfrg · 2021-07-19T06:33:58Z

So for RTL runs, it looks like it has the proper glyph order but incorrect offsets. This is going to be tricky because HarfBuzz simply handles this differently. It does a full logical reverse of the glyph buffer and provides appropriate offsets (so marks precede bases). The swash shaper maintains logical order and provides per-cluster mark offsets with the intention that clusters will be reversed but not glyphs. There is a mapping between the two but I can't think of a trivial change to the shape function right now.

dfrg · 2021-07-19T11:18:14Z

Some fixes pushed to master should correct the failed font loading in the gvar tests. I believe adding .insert_dotted_circles(true) while building the shaper should also fix most of the Balinese tests. The RTL tests will require more work and may need some changes to the shaper. I'm going to have to give some more thought to this.

Co-authored-by: Chad Brokaw <cbrokaw@gmail.com>

* support up to 256 features per stage * only apply nested contextual lookup to specified index

* Return success from matching contextual subtables regardless of nested lookup results

* Push potentially intermediate ligatures back onto stack in ligature subtable * Handle END_OF_TEXT in insertion subtable * Correct advance count in insertion subtable

kirawi · 2021-09-07T01:44:39Z

... I'm going to create a new PR, I have no idea what happened.

kirawi and others added 10 commits July 18, 2021 22:56

wip

e9b51a4

remove fonts

5f916b6

add fonts

2bb06af

fix

5aec738

fix

e087e45

fmt

38d8cb7

fix

1195391

fix

8a112dc

runs? not sure if correct

401c056

fix

5edf48f

dfrg marked this pull request as ready for review July 19, 2021 04:55

dfrg marked this pull request as draft July 19, 2021 04:55

script

77cfad1

dfrg reviewed Jul 19, 2021

View reviewed changes

tests/shaping_impl.rs Outdated Show resolved Hide resolved

dfrg reviewed Jul 19, 2021

View reviewed changes

tests/shaping_impl.rs Outdated Show resolved Hide resolved

dfrg reviewed Jul 19, 2021

View reviewed changes

tests/shaping_impl.rs Outdated Show resolved Hide resolved

dfrg and others added 27 commits September 6, 2021 21:43

use better script

d65a16f

remove unic-bidi

f7e6191

fix variable name

7db08e8

disable tests

84e1486

remove disabled tests

6b3d1e5

Update tests/shaping_impl.rs

8dcf58b

Co-authored-by: Chad Brokaw <cbrokaw@gmail.com>

Update tests/shaping_impl.rs

2cb4184

Co-authored-by: Chad Brokaw <cbrokaw@gmail.com>

add comments

f327da3

fix

375b4df

wip

098380e

rustfmt

c77bd89

Handle 'true' font header word

2cd5ee2

Ensure broken clusters are marked for last token

40914ee

Don't prevent reordering across a virama that has been substituted

0fa11b7

Expose basic glyph info in shaper

42b467c

Ensure minimal effective length of a cluster

faac628

Add option to retain default ignorables in the shaper

dccdade

Special case some Kannada codepoints in the USE

69a1e8e

Remove code that caused regressions in complex cluster parser

5e39b6f

Be less restrictive about adding advance width to glyphs with mark class

bbb1b15

Allow morx insertion at end of buffer

bc7fac4

Clamp iterations to prevent infinite loops in AAT tables

559a517

Attempted fix for dfrg#15

7554585

* support up to 256 features per stage * only apply nested contextual lookup to specified index

Fix long ligatures in Fira Code for dfrg#15

50d2b20

* Return success from matching contextual subtables regardless of nested lookup results

Fixes for AAT substitutions re: dfrg#17

c6c9fa7

* Push potentially intermediate ligatures back onto stack in ligature subtable * Handle END_OF_TEXT in insertion subtable * Correct advance count in insertion subtable

morx contextual subtable fixes

ec64bfa

Merge branch 'master' of github.com:dfrg/swash into tests

4346928

kirawi force-pushed the tests branch from f672ac4 to 4346928 Compare September 7, 2021 01:43

kirawi closed this Sep 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests #14

Tests #14

kirawi commented Jul 19, 2021 •

edited

Loading

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 •

edited

Loading

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 •

edited

Loading

kirawi commented Jul 19, 2021

kirawi commented Jul 19, 2021 •

edited

Loading

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 •

edited

Loading

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Sep 7, 2021

Tests #14

Tests #14

Conversation

kirawi commented Jul 19, 2021 • edited Loading

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 • edited Loading

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 • edited Loading

kirawi commented Jul 19, 2021

kirawi commented Jul 19, 2021 • edited Loading

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Jul 19, 2021 • edited Loading

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

dfrg commented Jul 19, 2021

kirawi commented Sep 7, 2021

kirawi commented Jul 19, 2021 •

edited

Loading

kirawi commented Jul 19, 2021 •

edited

Loading

kirawi commented Jul 19, 2021 •

edited

Loading

kirawi commented Jul 19, 2021 •

edited

Loading

kirawi commented Jul 19, 2021 •

edited

Loading