On architectures that allow unaligned loads, we should rewrite b[0] == c1 && b[1] == c2 to *(*uint16)(&b[0]) == (c1 << 8) + c2. We should also do this on all architectures in cases where we know b is appropriately aligned. The same applies to uint32 and uint64 loads, as appropriate.
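A minimal sketch of the transformation, written as source-level Go rather than as an SSA rewrite rule. The byte order of the combined constant depends on the target's endianness; this sketch assumes little-endian, so the constant is built as c1 | c2<<8 rather than the (c1 << 8) + c2 form above. The function names are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// matchSlow is the pattern as a programmer writes it:
// two one-byte loads and two compares.
func matchSlow(b []byte, c1, c2 byte) bool {
	return b[0] == c1 && b[1] == c2
}

// matchFast is the single-load form the compiler could generate
// on little-endian targets that permit unaligned loads:
// one 16-bit load and one compare against a fused constant.
func matchFast(b []byte, c1, c2 byte) bool {
	// Little-endian: b[0] lands in the low byte of the load.
	return *(*uint16)(unsafe.Pointer(&b[0])) == uint16(c1)|uint16(c2)<<8
}

func main() {
	b := []byte("go")
	fmt.Println(matchSlow(b, 'g', 'o'), matchFast(b, 'g', 'o'))
}
```

Both forms agree on matching and non-matching inputs; the win is that the fast form halves the number of loads, compares, and branches, which is why having the SSA rules (or the front end) produce it automatically is attractive.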
See CL 26758 for more discussion; that CL will cause many of these to be generated.
We might also independently want to update the front end (near OCMPSTR) to generate the larger comparisons directly.